(57) Abstract: A method of modifying a test NR polypeptide is disclosed. The method can include: providing a test NR polypeptide
sequence having a characteristic that is targeted for modification; aligning the test NR polypeptide sequence with at least one
reference NR polypeptide sequence for which an X-ray structure is available, wherein the at least one reference NR polypeptide sequence
has a characteristic that is desired for the test NR polypeptide; building a three-dimensional model for the test NR polypeptide using
the three-dimensional coordinates of the X-ray structure(s) of the at least one reference polypeptide and its sequence alignment with
the test NR polypeptide sequence; examining the three-dimensional model of the test NR polypeptide for differences with the at least
one reference polypeptide that are associated with the desired characteristic; and mutating at least one amino acid residue in the test
NR polypeptide sequence located at a difference identified above to a residue associated with the desired characteristic, whereby
the test NR polypeptide is modified. An isolated GR polypeptide comprising a mutation in a ligand binding domain, wherein the
mutation alters the solubility of the ligand binding domain, is also disclosed. An isolated GR polypeptide, or functional portion
thereof, having one or more mutations comprising a substitution of a hydrophobic amino acid residue by a hydrophilic amino acid
residue is also disclosed. Representative mutations are F602S and F602D substitutions. Expression of the GR polypeptide in E. coli
is also provided. A solved three-dimensional crystal structure of a glucocorticoid receptor α ligand binding domain polypeptide is
also disclosed, along with a crystalline form of the glucocorticoid receptor α ligand binding domain polypeptide. Methods of
designing modulators of the biological activity of glucocorticoid receptor α and other nuclear receptor, steroid receptor and glucocorticoid
receptor polypeptides and nuclear receptor, steroid receptor and glucocorticoid receptor ligand binding domain polypeptides are
also disclosed.
CRYSTALLIZED GLUCOCORTICOID RECEPTOR LIGAND BINDING DOMAIN
POLYPEPTIDE AND SCREENING METHODS EMPLOYING SAME 

Cross Reference to Related Application

The present patent application is based on and claims priority to U.S.
Provisional Application Serial No. 60/305,902, entitled "CRYSTALLIZED
GLUCOCORTICOID RECEPTOR LIGAND BINDING DOMAIN POLYPEPTIDE
AND SCREENING METHODS EMPLOYING SAME", which was filed July 17,
2001 and is incorporated herein by reference in its entirety.



Technical Field 

The present invention relates generally to a modified glucocorticoid receptor
polypeptide, to a modified glucocorticoid receptor ligand binding domain,
and to the structure of a glucocorticoid receptor ligand binding domain in complex
with a ligand and a co-activator. The invention further relates to methods by
which a soluble glucocorticoid receptor polypeptide can be generated and by which
modulators and ligands of nuclear receptors, particularly steroid receptors and
more particularly glucocorticoid receptors and the ligand binding domains
thereof, can be identified.



Abbreviations 



ATP adenosine triphosphate

ADP adenosine diphosphate

AR androgen receptor

CAT chloramphenicol acetyltransferase

CBP CREB binding protein

cDNA complementary DNA

DBD DNA binding domain

DMSO dimethyl sulfoxide

DNA deoxyribonucleic acid

DTT dithiothreitol

EDTA ethylenediaminetetraacetic acid

ER estrogen receptor

GR glucocorticoid receptor 

GRE glucocorticoid responsive element 

GST glutathione S-transferase 

HEPES N-2-Hydroxyethylpiperazine-N-2-ethanesulfonic acid 

HSP heat shock protein 

kDa kilodalton(s) 

LBD ligand binding domain 

MR mineralocorticoid receptor

NDP nucleotide diphosphate 

NID nuclear receptor interaction domain 

NTP nucleotide triphosphate 

PAGE polyacrylamide gel electrophoresis 

PCR polymerase chain reaction 

pI isoelectric point

PPAR peroxisome proliferator-activated receptor 

PR progesterone receptor 

RAR retinoic acid receptor

RXR retinoid X receptor 

SDS sodium dodecyl sulfate 

SDS-PAGE sodium dodecyl sulfate polyacrylamide gel electrophoresis

TIF2 transcription intermediary factor 2 

TR thyroid receptor 

VDR vitamin D receptor 

Amino Acid Abbreviations 
Single-Letter Code Three-Letter Code Name 

A Ala Alanine

V Val Valine

L Leu Leucine

I Ile Isoleucine

P Pro Proline

F Phe Phenylalanine

W Trp Tryptophan

M Met Methionine

G Gly Glycine

S Ser Serine

T Thr Threonine

C Cys Cysteine

Y Tyr Tyrosine

N Asn Asparagine

Q Gln Glutamine

D Asp Aspartic Acid

E Glu Glutamic Acid

K Lys Lysine

R Arg Arginine

H His Histidine



Functionally Equivalent Codons 



Amino Acid Codons

Alanine Ala A GCA GCC GCG GCU

Cysteine Cys C UGC UGU

Aspartic Acid Asp D GAC GAU

Glutamic Acid Glu E GAA GAG

Phenylalanine Phe F UUC UUU

Glycine Gly G GGA GGC GGG GGU

Histidine His H CAC CAU

Isoleucine Ile I AUA AUC AUU

Lysine Lys K AAA AAG

Methionine Met M AUG

Asparagine Asn N AAC AAU

Proline Pro P CCA CCC CCG CCU

Glutamine Gln Q CAA CAG

Threonine Thr T ACA ACC ACG ACU

Valine Val V GUA GUC GUG GUU

Tryptophan Trp W UGG

Tyrosine Tyr Y UAC UAU

Leucine Leu L UUA UUG CUA CUC CUG CUU

Arginine Arg R AGA AGG CGA CGC CGG CGU

Serine Ser S AGC AGU UCA UCC UCG UCU
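By way of illustration only, the table of functionally equivalent codons can be held as a simple lookup structure. The sketch below assumes the standard genetic code (in which serine is encoded by AGC, AGU, UCA, UCC, UCG and UCU); the names SYNONYMOUS_CODONS, build_codon_lookup and are_functionally_equivalent are hypothetical and do not appear elsewhere in this document.

```python
# Synonymous RNA codons keyed by one-letter amino acid code
# (standard genetic code; stop codons are intentionally omitted).
SYNONYMOUS_CODONS = {
    "A": ("GCA", "GCC", "GCG", "GCU"),
    "C": ("UGC", "UGU"),
    "D": ("GAC", "GAU"),
    "E": ("GAA", "GAG"),
    "F": ("UUC", "UUU"),
    "G": ("GGA", "GGC", "GGG", "GGU"),
    "H": ("CAC", "CAU"),
    "I": ("AUA", "AUC", "AUU"),
    "K": ("AAA", "AAG"),
    "M": ("AUG",),
    "N": ("AAC", "AAU"),
    "P": ("CCA", "CCC", "CCG", "CCU"),
    "Q": ("CAA", "CAG"),
    "T": ("ACA", "ACC", "ACG", "ACU"),
    "V": ("GUA", "GUC", "GUG", "GUU"),
    "W": ("UGG",),
    "Y": ("UAC", "UAU"),
    "L": ("UUA", "UUG", "CUA", "CUC", "CUG", "CUU"),
    "R": ("AGA", "AGG", "CGA", "CGC", "CGG", "CGU"),
    "S": ("AGC", "AGU", "UCA", "UCC", "UCG", "UCU"),
}


def build_codon_lookup(table):
    """Invert the table: map each codon to its amino acid, rejecting duplicates."""
    lookup = {}
    for amino_acid, codons in table.items():
        for codon in codons:
            if codon in lookup:
                raise ValueError(f"codon {codon} is assigned twice")
            lookup[codon] = amino_acid
    return lookup


CODON_TO_AA = build_codon_lookup(SYNONYMOUS_CODONS)


def are_functionally_equivalent(codon_a, codon_b):
    """Return True if both codons encode the same amino acid (DNA 'T' is read as 'U')."""
    aa_a = CODON_TO_AA.get(codon_a.upper().replace("T", "U"))
    aa_b = CODON_TO_AA.get(codon_b.upper().replace("T", "U"))
    return aa_a is not None and aa_a == aa_b
```

Building the inverse lookup with a duplicate check is a cheap way to catch a codon that has been entered under two amino acids.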



Background Art 

Nuclear receptors reside in either the cytoplasm or nucleus of eukaryotic
cells and represent a superfamily of proteins that specifically bind a physiologically
relevant small molecule, such as a hormone or vitamin. As a result of a molecule
binding to a nuclear receptor, the nuclear receptor changes the ability of a cell to
transcribe DNA, i.e., nuclear receptors modulate the transcription of DNA.
However, they can also have transcription-independent actions.

Unlike integral membrane receptors and membrane-associated receptors,
nuclear receptors reside in either the cytoplasm or nucleus of eukaryotic cells.
Thus, nuclear receptors comprise a class of intracellular, soluble, ligand-regulated
transcription factors. Nuclear receptors include but are not limited to receptors for
androgens, mineralocorticoids, progestins, estrogens, thyroid hormones, vitamin D,
retinoids, eicosanoids, peroxisome proliferators and, pertinently, glucocorticoids.
Many nuclear receptors, identified by either sequence homology to known
receptors (See, e.g., Drewes et al., (1996) Mol. Cell. Biol. 16:925-31) or based on
their affinity for specific DNA binding sites in gene promoters (See, e.g., Sladek et
al., Genes Dev. 4:2353-65), have unascertained ligands and are therefore
commonly termed "orphan receptors".

Glucocorticoids are an example of a cellular molecule that has been 
associated with cellular proliferation. Glucocorticoids are known to induce growth 
arrest in the G1-phase of the cell cycle in a variety of cells, both in vivo and in 
vitro, and have been shown to be useful in the treatment of certain cancers. The 
glucocorticoid receptor (GR) belongs to an important class of transcription factors 
that alter the expression of target genes in response to a specific hormone signal. 
Accumulated evidence indicates that receptor-associated proteins play key roles
[illegible] co-purify with the GR is constantly expanding.

[illegible] also [illegible] on the skin, joints, and tendons. They are important
for treatment of disorders where [illegible] immune system activity.
Disorders of this sort include but are not limited to rheumatoid arthritis,
[illegible] disease, [illegible] like systemic lupus erythematosus.
Glucocorticoids are also used to treat asthma and are widely used with other
drugs to prevent the rejection of organ transplants. [illegible] (leukemias) and
[illegible] (lymphomas) can also respond to corticosteroid drugs.

[illegible] effects [illegible] express receptors for them. They regulate the
expression of several genes either positively or negatively ([illegible] 98: 267-278;
[illegible] (1985) [illegible] 209-252; Evans, (1988) Science [illegible];
[illegible] (1989) Cancer Res. 49: 2259S-2265S). Due in part to their ability to
[illegible] solid tumors and other [illegible] psoriasis. The inclusion of
[illegible] in chemotherapeutic regimens has contributed to a high rate of cure of
certain leukemias and lymphomas which were formerly lethal (Homo-Delarche, (1984)
[illegible]). [illegible] effects after binding to their receptors, the mechanism
of cell kill is not completely understood, although several hypotheses have been
proposed. Among these [illegible]; the induction of endonucleases; and the
induction of a cyclic AMP-[illegible] ([illegible] (1989) [illegible] 269: [illegible];
[illegible] Vedeckis, (1986) Cancer Res. 46: 2457-2462; Kelso & Munck, (1984) J. Immunol.
133:784-791; Gruol et al., (1989) Molec. Endocrinol. 3: 2119-2127; Yuh &
Thompson, (1989) J. Biol. Chem. 264: 10904-10910).

Polypeptides, including the glucocorticoid receptor ligand binding domain,
have a three-dimensional structure determined by the primary amino acid
sequence and the environment surrounding the polypeptide. This
three-dimensional structure establishes the polypeptide's activity, stability, binding
affinity, binding specificity, and other biochemical attributes. Thus, knowledge of a
protein's three-dimensional structure can provide much guidance in designing
agents that mimic, inhibit, or improve its biological activity.

The three-dimensional structure of a polypeptide can be determined in a
number of ways. Many of the most precise methods employ X-ray crystallography
(See, e.g., Van Holde, (1971) Physical Biochemistry, Prentice-Hall, New Jersey,
pp. 221-39). This technique relies on the ability of crystalline lattices to diffract
X-rays or other forms of radiation. Diffraction experiments suitable for determining
the three-dimensional structure of macromolecules typically require high-quality
crystals. Unfortunately, such crystals have been unavailable for the ligand binding
domain of a human glucocorticoid receptor, as well as many other proteins of
interest. Thus, high-quality diffracting crystals of the ligand binding domain of a
human glucocorticoid receptor in complex with a ligand and a peptide would
greatly assist in the elucidation of its three-dimensional structure.

Clearly, the solved crystal structure of the ligand binding domain of a
glucocorticoid receptor polypeptide would be useful in the design of modulators of
activity mediated by the glucocorticoid receptor. Evaluation of the available
sequence data shows that GRα is particularly similar to MR, PR and AR. The
GRα LBD has approximately 56%, 54% and 50% sequence identity to the MR, PR
and AR LBDs, respectively. The GRβ amino acid sequence is identical to the
GRα amino acid sequence for residues 1-727, but the remaining 15 residues in
GRβ show no significant similarity to the remaining 50 residues in GRα. If no
X-ray structure were available for GRα, then one could build a model for GRα using
the available X-ray structures of PR and/or AR as templates. These theoretical
models have some utility, but cannot be as accurate as a true X-ray structure,
such as the X-ray structure disclosed here. Because of its limited accuracy, a
model for GRα will generally be less useful than an X-ray structure for the design
of agonists, antagonists and modulators of GRα.

The solved GRα-ligand-co-activator crystal structure would provide
structural details and insights necessary to [illegible] preferred for [illegible].
By exploiting the structural details obtained from a GRα-ligand-co-activator crystal
structure, it would be possible to design a GRα modulator that [illegible]
steroid receptors and nuclear receptors [illegible] the ligand binding domain
[illegible]. A GRα modulator [illegible] would take advantage of heretofore
unknown GRα structural considerations and thus be more effective than a
modulator developed using homology-based design. Potential or existent
homology models cannot provide the necessary degree of specificity. A GRα
modulator [illegible] using the [illegible] ligand binding domain of GRα in complex
with a ligand and a co-activator would [illegible] development of modulators
of [illegible].

Although several journal articles have referred to GR mutants having
"increased ligand efficacy" in cell-based assays, it has not been mentioned that
mutants could have improved solution properties so that they could provide a
suitable reagent for purification, assay, and crystallization. See Garabedian &
Yamamoto (1992) Mol. Biol. Cell 3: 1245-1257; [illegible]; Bohen (1995) [illegible];
Bohen [illegible] 3330-3339; Freeman et al. (2000) Genes Dev. 14: [illegible].

Indeed, it is well documented that GR [illegible] chaperones such as hsp90,
hsc70, and p23. In the past, it has been considered that GR would either not be
active or soluble if purified away from these binding [illegible] 273: 13918-13924;
Rajapandi et al. (2000) [illegible].
Still other journal articles have reported E. coli expression of GST-GR, but
also noted a failure to purify the purported polypeptide. See Ohara-Nemoto et al., 
(1990) J. Steroid Biochem. Molec. Biol. 37: 481-490; Caamano et al., (1994) 
Annal. NY Acad. Sci. 746: 68-77. 

What is needed, therefore, is a purified, soluble GRα LBD polypeptide for
use in structural studies, as well as methods for making the same. Such methods
would also find application in the preparation of modified NRs in general.

What is also needed is a crystallized form of a GRα ligand binding domain,
preferably in complex with a ligand and more preferably in complex with a ligand
and a co-activator. Acquisition of crystals of the GRα ligand binding domain
polypeptide permits the three-dimensional structure of a GRα ligand binding
domain (LBD) polypeptide to be determined. Knowledge of the three-dimensional
structure can facilitate the design of modulators of GR-mediated activity. Such
modulators can lead to therapeutic compounds to treat a wide range of conditions,
including inflammation, tissue rejection, auto-immunity, malignancies such as
leukemias and lymphomas, Cushing's syndrome, acute adrenal insufficiency,
congenital adrenal hyperplasia, rheumatic fever, polyarteritis nodosa,
granulomatous polyarteritis, inhibition of myeloid cell lines, immune
proliferation/apoptosis, HPA axis suppression and regulation, hypercortisolemia,
modulation of the TH1/TH2 cytokine balance, chronic kidney disease, stroke and
spinal cord injury, hypercalcemia, hyperglycemia, acute adrenal insufficiency,
chronic primary adrenal insufficiency, secondary adrenal insufficiency, congenital
adrenal hyperplasia, cerebral edema, thrombocytopenia, Little's syndrome,
inflammatory bowel disease, systemic lupus erythematosus, polyarteritis nodosa,
Wegener's granulomatosis, giant cell arteritis, rheumatoid arthritis, osteoarthritis,
hay fever, allergic rhinitis, urticaria, angioneurotic edema, chronic obstructive
pulmonary disease, asthma, tendonitis, bursitis, Crohn's disease, ulcerative colitis,
autoimmune chronic active hepatitis, organ transplantation, hepatitis, cirrhosis,
inflammatory scalp alopecia, panniculitis, psoriasis, discoid lupus erythematosus,
inflamed cysts, atopic dermatitis, pyoderma gangrenosum, pemphigus vulgaris,
bullous pemphigoid, systemic lupus erythematosus, dermatomyositis, herpes
gestationis, eosinophilic fasciitis, relapsing polychondritis, inflammatory vasculitis,
sarcoidosis, Sweet's disease, type 1 reactive leprosy, capillary hemangiomas,
contact dermatitis, atopic dermatitis, lichen planus, exfoliative dermatitis,
erythema nodosum, acne, hirsutism, toxic epidermal necrolysis, erythema
multiforme, and cutaneous T-cell lymphoma. A GR modulator
developed in accordance with the present invention can also be employed to treat
Human Immunodeficiency Virus (HIV) and cell apoptosis, and can be employed in
treating cancerous conditions including, but not limited to, Kaposi's sarcoma,
immune system activation and modulation, desensitization of inflammatory
responses, IL-1 expression, natural killer cell development, lymphocytic leukemia,
and treatment of retinitis pigmentosa. Other applications for such a modulator
comprise modulating cognitive performance, memory and learning enhancement,
depression, addiction, mood disorders, chronic fatigue syndrome, schizophrenia,
stroke, sleep disorders, anxiety, immunostimulants, repressors, wound healing
and a role as a tissue repair agent or in anti-retroviral therapy.

Summary of the Invention 
A method of modifying a test NR polypeptide is disclosed. The method can
comprise: providing a test NR polypeptide sequence having a characteristic that
is targeted for modification; aligning the test NR polypeptide sequence with at
least one reference NR polypeptide sequence for which an X-ray structure is
available, wherein the at least one reference NR polypeptide sequence has a
characteristic that is desired for the test NR polypeptide; building a
three-dimensional model for the test NR polypeptide using the three-dimensional
coordinates of the X-ray structure(s) of the at least one reference polypeptide and
its sequence alignment with the test NR polypeptide sequence; examining the
three-dimensional model of the test NR polypeptide for differences with the at
least one reference polypeptide that are associated with the desired characteristic;
and mutating at least one amino acid residue in the test NR polypeptide sequence
located at a difference identified above to a residue associated with the desired
characteristic, whereby the test NR polypeptide is modified.

A method of altering the solubility of a test NR polypeptide is also disclosed
in accordance with the present invention. In a preferred embodiment, the method
comprises: (a) providing a reference NR polypeptide sequence and a test NR
polypeptide sequence; (b) comparing the reference NR polypeptide sequence and
the test NR polypeptide sequence to identify one or more residues in the test NR
sequence that are more or less hydrophilic than a corresponding residue in the
reference NR polypeptide sequence; and (c) mutating the residue in the test NR
polypeptide sequence identified in step (b) to a residue having a different
hydrophilicity, whereby the solubility of the test NR polypeptide is altered.
Optionally, the reference NR polypeptide sequence is an AR or a PR sequence,
and the test polypeptide sequence is a GR polypeptide sequence. Alternatively,
the reference polypeptide sequence is a crystalline GR LBD. The comparing of
step (b) is preferably by sequence alignment.
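The comparison of steps (a) through (c) can be sketched computationally. The following is a minimal illustration only, under several assumptions not stated in this document: hydrophilicity is scored with the Kyte-Doolittle hydropathy scale, the two sequences are supplied already aligned (gaps written as "-"), and a residue is flagged when the test residue is more hydrophobic than the reference residue by at least a threshold. The function name, the threshold of 2.0 and the toy sequences are hypothetical and are not GR, AR or PR sequences.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def propose_solubilizing_mutations(test_aln, ref_aln, first_residue=1, min_delta=2.0):
    """List (mutation, hydropathy drop) for aligned positions where the test
    residue is more hydrophobic than the reference residue by >= min_delta.

    test_aln and ref_aln are pre-aligned strings of equal length; residue
    numbering follows the test sequence, starting at first_residue.
    """
    if len(test_aln) != len(ref_aln):
        raise ValueError("aligned sequences must have equal length")
    proposals = []
    position = first_residue - 1
    for test_res, ref_res in zip(test_aln, ref_aln):
        if test_res != "-":
            position += 1  # only test residues advance the numbering
        if test_res == "-" or ref_res == "-" or test_res == ref_res:
            continue
        delta = KYTE_DOOLITTLE[test_res] - KYTE_DOOLITTLE[ref_res]
        if delta >= min_delta:
            proposals.append((f"{test_res}{position}{ref_res}", round(delta, 1)))
    return proposals


# Toy aligned fragments (hypothetical), numbered from residue 10 of the test sequence.
print(propose_solubilizing_mutations("AFLV-K", "ASLTGK", first_residue=10))
```

Any candidate produced this way would still need to be examined against a three-dimensional model, since a purely sequence-based score cannot tell whether the residue is buried or exposed.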

An isolated GR polypeptide comprising a mutation in a ligand binding 
domain, wherein the mutation alters the solubility of the ligand binding domain, is 
also disclosed. An isolated GR polypeptide, or functional portion thereof, having 
one or more mutations comprising a substitution of a hydrophobic amino acid 
residue by a hydrophilic amino acid residue in a ligand binding domain is also 
disclosed. Preferably, in each case, the mutation can be at a residue selected 
from the group consisting of V552, W557, F602, L636, Y648, W712, L741, L535, 
V538, C638, M691, V702, Y648, Y660, L685, M691, V702, W712, L733, Y764 
and combinations thereof. More preferably, the mutation is selected from the 
group consisting of V552K, W557S, F602S, F602D, F602E, L636E, Y648Q, 
W712S, L741R, L535T, V538S, C638S, M691T, V702T, W712T and combinations 
thereof. Antibodies against such polypeptides are also disclosed, as are methods 
of detecting such polypeptides and methods of identifying substances that 
modulate the biological activity of such polypeptides. 
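The mutation labels used above follow the usual convention of wild-type residue, residue number, replacement residue (for example, F602S replaces phenylalanine 602 with serine). A minimal sketch of parsing and applying such a label is given below; the function names and the five-residue toy sequence are hypothetical, and first_residue is an assumed parameter giving the number of the first residue in a fragment.

```python
import re

# One-letter wild-type residue, residue number, one-letter replacement residue.
_MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label such as 'F602S' into ('F', 602, 'S')."""
    match = _MUTATION_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutation(sequence, label, first_residue=1):
    """Return a copy of sequence carrying the mutation.

    first_residue is the residue number of sequence[0], so that a fragment
    can keep the numbering of the full-length polypeptide.
    """
    wild_type, position, replacement = parse_mutation(label)
    index = position - first_residue
    if not 0 <= index < len(sequence):
        raise IndexError(f"residue {position} lies outside the fragment")
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Toy fragment (hypothetical) whose first residue is numbered 600.
print(apply_mutation("MKFLV", "F602S", first_residue=600))
```

Checking the wild-type residue before substituting guards against applying a label to a fragment with the wrong numbering offset.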

An isolated nucleic acid molecule encoding a GR polypeptide comprising a 
mutation in a ligand binding domain, wherein the mutation alters the solubility of 
the ligand binding domain, or encoding a GR LBD polypeptide, or functional 
portion thereof, having one or more mutations comprising a substitution of a 
hydrophobic amino acid residue by a hydrophilic amino acid residue, is also 
disclosed. A chimeric gene, comprising the nucleic acid molecule operably linked 
to a heterologous promoter, a vector comprising the chimeric gene, and a host cell 
comprising the chimeric gene are also disclosed. Methods for detecting such a 
nucleic acid molecule are also disclosed. 
A substantially pure GRα ligand binding domain polypeptide in crystalline
form is disclosed. Preferably, the crystalline form has lattice constants of a = b =
126.014 Å, c = 86.312 Å, α = 90°, β = 90°, γ = 120°. Preferably, the crystalline
form is a hexagonal crystalline form. More preferably, the crystalline form has a
space group of P6₁. Even more preferably, the GRα ligand binding domain
polypeptide has the F602S amino acid sequence shown in Example 2. Even
more preferably, the GRα ligand binding domain has a crystalline structure further
characterized by the coordinates corresponding to Table 4.
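As a worked illustration, the unit cell volume implied by the stated lattice constants (a = b = 126.014 Å, c = 86.312 Å, with cell angles of 90°, 90° and 120°) can be computed from the general triclinic formula, which for a hexagonal cell reduces to V = a²·c·sin(120°). The function name below is hypothetical; the result is roughly 1.19 × 10⁶ Å³.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume of a unit cell from its edge lengths and angles (general formula).

    V = a*b*c*sqrt(1 - cos^2(alpha) - cos^2(beta) - cos^2(gamma)
                   + 2*cos(alpha)*cos(beta)*cos(gamma))
    """
    cos_a, cos_b, cos_g = (
        math.cos(math.radians(angle)) for angle in (alpha_deg, beta_deg, gamma_deg)
    )
    return a * b * c * math.sqrt(
        1.0 - cos_a**2 - cos_b**2 - cos_g**2 + 2.0 * cos_a * cos_b * cos_g
    )


# Lattice constants as stated for the crystalline form, in angstroms and degrees.
volume = unit_cell_volume(126.014, 126.014, 86.312, 90.0, 90.0, 120.0)
print(f"unit cell volume: {volume:.0f} cubic angstroms")
```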

Preferably, the GRα ligand binding domain polypeptide is in complex with a
ligand. Optionally, the crystalline form contains two GRα ligand binding domain
polypeptides in the asymmetric unit. Preferably, the crystalline form is such that
the three-dimensional structure of the crystallized GRα ligand binding domain
polypeptide can be determined to a resolution of about 2.8 Å or better. Even more
preferably, the crystalline form contains one or more atoms having a molecular
weight of 40 grams/mol or greater.

A method for determining the three-dimensional structure of a crystallized
GR ligand binding domain polypeptide to a resolution of about 2.8 Å or better, the
method comprising: (a) crystallizing a GR ligand binding domain polypeptide; and
(b) analyzing the GR ligand binding domain polypeptide to determine the
three-dimensional structure of the crystallized GR ligand binding domain polypeptide,
whereby the three-dimensional structure of a crystallized GR ligand binding
domain polypeptide is determined to a resolution of about 2.8 Å or better.
Preferably, the analyzing is by X-ray diffraction. More preferably, the
crystallization is accomplished by the hanging drop method, and wherein the GRα
ligand binding domain is mixed with a reservoir.

A method of generating a crystallized GR ligand binding domain
polypeptide, the method comprising: (a) incubating a solution comprising a GR
ligand binding domain with a reservoir; and (b) crystallizing the GR ligand binding
domain polypeptide using the hanging drop method, whereby a crystallized GR
ligand binding domain polypeptide is generated.

A method of designing a modulator of a nuclear receptor, the method
comprising: (a) designing a potential modulator of a nuclear receptor that will
make interactions with amino acids in the ligand binding site of the nuclear
receptor based upon the atomic structure coordinates of a GR ligand binding
domain polypeptide; (b) synthesizing the modulator; and (c) determining whether
the potential modulator modulates the activity of the nuclear receptor, whereby a
modulator of a nuclear receptor is designed.

A method of designing a modulator that selectively modulates the activity of
a GRα polypeptide, the method comprising: (a) obtaining a crystalline form of a
GRα ligand binding domain polypeptide; (b) determining the three-dimensional
structure of the crystalline form of the GRα ligand binding domain polypeptide;
and (c) synthesizing a modulator based on the three-dimensional structure of the
crystalline form of the GRα ligand binding domain polypeptide, whereby a
modulator that selectively modulates the activity of a GRα polypeptide is
designed. Preferably, the method further comprises contacting a GRα ligand
binding domain polypeptide with the potential modulator; and assaying the GRα
ligand binding domain polypeptide for binding of the potential modulator, for a
change in activity of the GRα ligand binding domain polypeptide, or both. More
preferably, the crystalline form is in orthorhombic form. Even more preferably, the
crystals are such that the three-dimensional structure of the crystallized GRα
ligand binding domain polypeptide can be determined to a resolution of about 2.8
Å or better.

A method of screening a plurality of compounds for a modulator of a GR
ligand binding domain polypeptide, the method comprising: (a) providing a library
of test samples; (b) contacting a GR ligand binding domain polypeptide with each
test sample; (c) detecting an interaction between a test sample and the GR ligand
binding domain polypeptide; (d) identifying a test sample that interacts with the
GR ligand binding domain polypeptide; and (e) isolating a test sample that
interacts with the GR ligand binding domain polypeptide, whereby a plurality of
compounds is screened for a modulator of a GR ligand binding domain
polypeptide. Preferably, the test samples are bound to a substrate, and more
preferably, the test samples are synthesized directly on a substrate. The GR
ligand binding domain polypeptide can be in soluble or crystalline form.

A method for identifying a GR modulator is also disclosed. In a preferred
embodiment, the method comprises: (a) providing atomic coordinates of a GR
ligand binding domain to a computerized modeling system; and (b) modeling
ligands that fit spatially into the binding pocket of the GR ligand binding domain to
thereby identify a GR modulator, whereby a GR modulator is identified.
Preferably, the method further comprises identifying in an assay for GR-mediated
activity a modeled ligand that increases or decreases the activity of the GR.
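One elementary operation that a computerized modeling system of this kind might perform is listing the residues that lie within a distance cutoff of a modeled ligand pose, as a rough indication of which residues line the binding pocket. The sketch below is illustrative only: the function name, the 4.0 Å cutoff, the residue labels and all coordinates are hypothetical and are not taken from Table 4 or from any other structure.

```python
import math


def residues_near_ligand(protein_atoms, ligand_atoms, cutoff=4.0):
    """Return sorted residue labels having at least one atom within cutoff
    (in angstroms) of any ligand atom.

    protein_atoms: list of (residue_label, (x, y, z)) tuples.
    ligand_atoms:  list of (x, y, z) tuples.
    """
    contacts = set()
    for residue_label, xyz in protein_atoms:
        if any(math.dist(xyz, ligand_xyz) <= cutoff for ligand_xyz in ligand_atoms):
            contacts.add(residue_label)
    return sorted(contacts)


# Hypothetical coordinates, for illustration only.
toy_protein = [
    ("RES1", (0.0, 0.0, 0.0)),
    ("RES1", (1.0, 0.0, 0.0)),
    ("RES2", (10.0, 0.0, 0.0)),
]
toy_ligand = [(0.0, 3.0, 0.0)]
print(residues_near_ligand(toy_protein, toy_ligand))
```

A brute-force pairwise search like this is adequate for a single binding pocket; a spatial index would be the usual choice for whole-structure scans.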

A method of identifying a modulator that selectively modulates the activity of
a GRα polypeptide compared to other GR polypeptides, the method comprising:
(a) providing atomic coordinates of a GRα ligand binding domain to a
computerized modeling system; and (b) modeling a ligand that fits into the binding
pocket of a GRα ligand binding domain and that interacts with conformationally
constrained residues of a GRα conserved among GR subtypes, whereby a
modulator that selectively modulates the activity of a GRα polypeptide compared
to other polypeptides is identified. Preferably, the method further comprises
identifying in a biological assay for GRα activity a modeled ligand that selectively
binds to GRα and increases or decreases the activity of said GRα.

A method of designing a modulator of a GR polypeptide, the method
comprising: (a) selecting a candidate GR ligand; (b) determining which amino acid
or amino acids of a GR polypeptide interact with the ligand using a
three-dimensional model of a crystallized protein comprising a GRα LBD; (c) identifying
in a biological assay for GR activity a degree to which the ligand modulates the
activity of the GR polypeptide; (d) selecting a chemical modification of the ligand
wherein the interaction between the amino acids of the GR polypeptide and the
ligand is predicted to be modulated by the chemical modification; (e) synthesizing
a chemical compound with the selected chemical modification to form a modified
ligand; (f) contacting the modified ligand with the GR polypeptide; (g) identifying in
a biological assay for GR activity a degree to which the modified ligand modulates
the biological activity of the GR polypeptide; and (h) comparing the biological
activity of the GR polypeptide in the presence of modified ligand with the biological
activity of the GR polypeptide in the presence of the unmodified ligand, whereby a
modulator of a GR polypeptide is designed. Preferably, the GR polypeptide is a
GRα polypeptide. More preferably, the three-dimensional model of a crystallized
protein is a GRα LBD polypeptide with a bound ligand. Optionally, the method
further comprises repeating steps (a) through (f), if the biological activity of the GR
polypeptide in the presence of the modified ligand varies from the biological
activity of the GR polypeptide in the presence of the unmodified ligand.

An assay method for identifying a compound that inhibits binding of a
ligand to a GR polypeptide, the assay method comprising: (a) designing a test
inhibitor compound based on the three-dimensional atomic coordinates of GR; (b)
incubating a GR polypeptide with a ligand in the presence of a test inhibitor
compound; (c) determining an amount of ligand that is bound to the GR
polypeptide, wherein decreased binding of ligand to the GR protein in the
presence of the test inhibitor compound relative to binding of ligand in the
absence of the test inhibitor compound is indicative of inhibition; and (d)
identifying the test compound as an inhibitor of ligand binding if decreased ligand
binding is observed, whereby a compound that inhibits binding of a ligand to a GR
polypeptide is identified.
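The comparison in steps (c) and (d) amounts to expressing ligand binding in the presence of the test inhibitor compound relative to binding in its absence. A minimal sketch of that arithmetic follows; the function names and the 50% decision threshold are assumptions made for illustration, since the method itself only requires that decreased binding be observed.

```python
def percent_inhibition(bound_with_inhibitor, bound_without_inhibitor):
    """Percent decrease in bound ligand caused by the test compound.

    Both arguments are bound-ligand signals in the same (arbitrary) units.
    """
    if bound_without_inhibitor <= 0:
        raise ValueError("control binding must be positive")
    return 100.0 * (1.0 - bound_with_inhibitor / bound_without_inhibitor)


def is_inhibitor(bound_with_inhibitor, bound_without_inhibitor, threshold_pct=50.0):
    """Flag a compound whose inhibition meets an assumed threshold."""
    return percent_inhibition(bound_with_inhibitor, bound_without_inhibitor) >= threshold_pct


# Example with made-up signals: binding falls from 100 units to 25 units.
print(percent_inhibition(25.0, 100.0))
```

In practice the threshold would be chosen from the noise in replicate control measurements rather than fixed in advance.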

A method of identifying a NR modulator that selectively modulates the
biological activity of one NR compared to GRα is also disclosed. The method
comprises: (a) providing an atomic structure coordinate set describing a GRα
ligand binding domain structure and at least one other atomic structure coordinate
set describing a NR ligand binding domain, each ligand binding domain
comprising a ligand binding site; (b) comparing the atomic structure coordinate
sets to identify at least one difference between the sets; (c) designing a candidate
ligand predicted to interact with the difference of step (b); (d) synthesizing the
candidate ligand; and (e) testing the synthesized candidate ligand for an ability to
selectively modulate a NR as compared to GRα, whereby a NR modulator that
selectively modulates the biological activity of a NR compared to GRα is identified.

Accordingly, it is an object of the present invention to provide a three 
dimensional structure of the ligand binding domain of a GR. The object is 
achieved in whole or in part by the present invention. 

An object of the invention having been stated hereinabove, other objects
will be evident as the description proceeds, when taken in connection with the
accompanying Drawings and Laboratory Examples as best described
hereinbelow.
Brief Description of the Drawings

Figure 1A depicts E. coli expression of mutant 6xHisGST-GR(521-777) 
F602S (SEQ ID NO:12) via SDS-PAGE. Staining was accomplished using the 
commercially available PROBLUE product. 

Figure 1B depicts E. coli expression of mutant 6xHisGST-GR(521-777)
F602D (SEQ ID NO:14) via SDS-PAGE. Staining was accomplished using the 
commercially available PROBLUE product. 

Figure 1C depicts purification of E. coli expressed GR(521-777)F602S 
(SEQ ID NO: 12) via SDS-PAGE. Staining was accomplished using the 
commercially available PROBLUE product. 

Figure 1D shows the partial purification of E. coli expressed GR(521-777)
for several mutants isolated by the LacI Fusion system.

Figures 2A-2C depict characterization of GR binding to dexamethasone
and the TIF2 LXXLL (SEQ ID NO:18) motif.

Figure 2A is a graph depicting the binding of 10 nM fluorescein 
dexamethasone to varied concentrations of GST-GR LBD (F602S) 521-777 
(circles), GR LBD (F602S) 521-777 (triangles) and GR LBD (F602S) 521-777 in 
the presence of 100 uM unlabeled dexamethasone (squares) as measured by 
fluorescence polarization. 

Figure 2B is a graph depicting ligand-dependent binding of TIF2 
LXXLL(SEQ ID NO: 18) motif to GR LBD. The binding of varied concentrations of 
GST-GR LBD (F602S) 521-777 to immobilized TIF2 732-756 peptide (SEQ ID 
NO: 17) in the presence of a five-fold excess of dexamethasone (triangles), RU486 
(squares) and no compound (circles) was measured by surface plasmon 
resonance. Each point is the average of two determinations. 

Figure 2C is a graph depicting that TIF2 coactivator peptide enhances
stability of GR dexamethasone binding activity. The effect of 25 uM coactivator
peptide TIF2 732-756 (diamonds) or no peptide (squares) on the binding of
GST-GR LBD (F602S) 521-777 to 10 nM fluorescein dexamethasone over time was
determined by fluorescence polarization.

Figure 3A is a worm/ribbon diagram depicting the overall arrangement of
the GR LBD dimers. Two GR LBDs are shown in white and gray worm
representation. TIF2 peptides are shown in gray ribbon and the two
dexamethasone ligands are shown in space filling.

Figure 3B is a worm/ribbon diagram depicting one orientation of the
GR/TIF2/Dex complex. TIF2 peptide is shown in ribbon and GR is shown in worm.
The AF2 helix of the GR is shown in gray worm. The key structural elements are
marked and are described herein below.

Figure 3C is a worm/ribbon diagram depicting a second orientation of the
GR/TIF2/Dex complex. TIF2 peptide is shown in ribbon and GR is shown in
worm. The AF2 helix of GR is shown in gray worm. The key structural elements
are marked and are described herein below.

Figures 4A and 4B depict the overlap of the GR LBD with the AR LBD
(Figure 4A) and the PR LBD (Figure 4B). GR is drawn with a thick line; AR and PR
are drawn with thin lines. Only the backbone C alpha atoms are shown.

Figure 5 is a sequence alignment of steroid receptors, particularly an
alignment of the F602S GRα sequence (SEQ ID NO:31) with MR (SEQ ID NO:26),
PR (SEQ ID NO:27), AR (SEQ ID NO:28), ERα (SEQ ID NO:29), and ERβ (SEQ ID
NO:30). Residues that lie within 5.0 angstroms of the ligand are identified with
small square boxes around the one-letter amino acid code. The ligands used for
this calculation are dexamethasone (for GR), progesterone (for PR),
dihydrotestosterone (for AR), estradiol (for ERα) and genistein (for ERβ). The
alpha-helices and beta-strands observed in the X-ray structures are identified by
the larger boxes and captions. Note that the secondary structure of MR is not
publicly known at this time, and thus is not annotated in the Figure. More than
one structure is available for PR, AR, ERα and ERβ, and, in some cases, the
alpha-helices have different endpoints in these different X-ray structures. The
variation in the alpha-helices is indicated here by using boxes with thicker and
thinner linewidths, where the thicker linewidth box encompasses residues that
adopt the same secondary structure in all available X-ray structures, and thinner
linewidth boxes encompass residues that adopt an alpha-helical structure in some
but not all X-ray structures. The secondary structures were determined by
graphical examination of the X-ray structures.
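
By way of non-limiting illustration, the selection of residues lying within 5.0 angstroms of a ligand, as boxed in Figure 5, can be sketched as follows. The sketch assumes that protein and ligand atoms are available as Cartesian coordinates in angstroms and that a residue qualifies when any one of its atoms lies within the cutoff of any ligand atom. The function name, residue labels and coordinates are hypothetical placeholders, not values from the disclosed structure.

```python
import math


def residues_near_ligand(protein_atoms, ligand_atoms, cutoff=5.0):
    """Return sorted labels of residues with any atom within cutoff of the ligand.

    protein_atoms: iterable of (residue_label, (x, y, z)) tuples.
    ligand_atoms:  iterable of (x, y, z) tuples.
    Distances are in angstroms.
    """
    ligand_atoms = list(ligand_atoms)
    near = set()
    for label, xyz in protein_atoms:
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            near.add(label)
    return sorted(near)


# Placeholder atoms for illustration only (hypothetical labels and positions).
toy_protein = [
    ("A1", (0.0, 0.0, 0.0)),
    ("A1", (1.0, 0.0, 0.0)),
    ("B2", (10.0, 0.0, 0.0)),
    ("C3", (4.0, 3.0, 0.0)),
    ("D4", (3.0, 0.0, 1.0)),
]
toy_ligand = [(0.0, 0.0, 1.0)]

print(residues_near_ligand(toy_protein, toy_ligand))
```

Whether the boundary is treated as inclusive, and whether hydrogen atoms are counted, are assumptions of this sketch rather than statements about how the Figure was prepared.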

Figure 6A depicts the GR ligand binding pocket. The GR LBD is shown in
a worm representation and the pocket is shown with a white surface.

Figure 6B is a diagram that depicts surfaces at the GR-dexamethasone
interface. The electron density is calculated with Fo coefficients and shown at a
one sigma cutoff. Key residues surrounding the ligand are also labeled, as
described herein below.

Figure 7 is a diagram of molecular interactions between GR and
dexamethasone. Both van der Waals contacts and hydrogen bonds are indicated
with dotted lines.

Figure 8 is a wire frame diagram showing the structure around the F602
mutation in the GRα LBD polypeptide. The lipophilic F602 side-chain of the
wild-type GRα protein would be located in a hydrophilic environment and could
destabilize the protein. Changing the phenylalanine (F) to a serine (S) allows the
S602 side-chain and NH group to make direct hydrogen bonds with two water
molecules (1H2O and 2H2O). Other residues involved with the two water
molecules are also shown and are described herein below.

Brief Description of Sequences in the Sequence Listing 

SEQ ID NOs:1 and 2 are, respectively, a DNA sequence encoding a wild 
type full-length human glucocorticoid receptor (GenBank Accession No. 31679) 
and the amino acid sequence (GenBank Accession No. 121069) of a human 
glucocorticoid receptor encoded by the DNA sequence. 

SEQ ID NOs:3 and 4 are, respectively, a DNA sequence encoding a F602S 
full-length human glucocorticoid receptor and the amino acid sequence of a 
human glucocorticoid receptor encoded by the DNA sequence. 

SEQ ID NOs:5 and 6 are, respectively, a DNA sequence encoding a F602D 
full-length human glucocorticoid receptor and the amino acid sequence of a 
human glucocorticoid receptor encoded by the DNA sequence. 

SEQ ID NOs:7 and 8 are, respectively, a DNA sequence encoding a
preferred embodiment of a full-length human glucocorticoid receptor of the
present invention and the amino acid sequence of a human glucocorticoid
receptor encoded by the DNA sequence. These sequences thus include variable
amino acids at the following locations: V552, W557, F602, L636, Y648, W712,
L741, L535, V538, C638, M691, V702, Y648, Y660, L685, M691, V702, W712,
L733, and Y764, thus reflecting the mutagenesis approach of the present
invention disclosed herein below. Thus, a full-length human glucocorticoid
receptor of the present invention can include a mutation at any one of these
residues, and/or at any combination of these residues.

SEQ ID NOs:9 and 10 are, respectively, a DNA sequence encoding a wild
type ligand binding domain of a human glucocorticoid receptor and the amino acid
sequence of a human glucocorticoid receptor encoded by the DNA sequence.

SEQ ID NOs:11 and 12 are, respectively, a DNA sequence encoding a
ligand binding domain (residues 521-777) of a human glucocorticoid receptor
containing a phenylalanine to serine mutation at residue 602 and the amino acid
sequence of a human glucocorticoid receptor encoded by the DNA sequence.

SEQ ID NOs:13 and 14 are, respectively, a DNA sequence encoding a
ligand binding domain (residues 521-777) of a human glucocorticoid receptor
containing a phenylalanine to aspartic acid mutation at residue 602 and the amino
acid sequence of a human glucocorticoid receptor encoded by the DNA
sequence.

SEQ ID NOs:15 and 16 are, respectively, a DNA sequence encoding a
preferred embodiment of a ligand binding domain of a human glucocorticoid
receptor of the present invention and the amino acid sequence of a human
glucocorticoid receptor encoded by the DNA sequence. These sequences thus
include variable amino acids at the following locations: V552, W557, F602, L636,
Y648, W712, L741, L535, V538, C638, M691, V702, Y648, Y660, L685, M691,
V702, W712, L733, and Y764, thus reflecting the mutagenesis approach of the
present invention disclosed herein below. Thus, a ligand binding domain of a
human glucocorticoid receptor of the present invention can include a mutation at
any one of these residues, and/or at any combination of these residues.

SEQ ID NO:17 is an amino acid sequence of amino acid residues 732-756 
of the human TIF2 protein. 

SEQ ID NO:18 is an LXXLL motif of the human TIF2 protein.

SEQ ID NOs:19-20 are oligonucleotide primers used to engineer a
polyhistidine tag in frame to the sequence encoding glutathione S-transferase
(GST).

SEQ ID NO:21 is the resulting amino acid sequence of the modified GST.

SEQ ID NOs:22-25 are oligonucleotide primers used in the mutagenesis
approach of the present invention.

SEQ ID NOs:26-31 are the ligand binding domain polypeptides of MR (SEQ
ID NO:26), PR (SEQ ID NO:27), AR (SEQ ID NO:28), ERα (SEQ ID NO:29),
ERβ (SEQ ID NO:30), and F602S GRα (SEQ ID NO:31), respectively. All of these
sequences are also shown in Figure 5. Note that the GRα sequence of
SEQ ID NO:31 starts at residue 527, whereas the F602S sequence of SEQ ID
NO:12 starts at residue 521.
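
By way of non-limiting illustration, the difference in starting residue matters when a mutation written in full-length numbering (e.g., F602S) is located within a fragment. Under zero-based indexing, residue 602 falls at index 81 of a fragment starting at residue 521 and at index 75 of a fragment starting at residue 527. The following sketch assumes the conventional notation of wild-type letter, residue number and replacement letter; the function name and the five-residue toy fragment are hypothetical and are not part of any sequence in the Sequence Listing.

```python
import re


def apply_point_mutation(fragment, first_residue_number, mutation):
    """Apply a substitution such as 'F602S' to a fragment.

    fragment:             one-letter amino acid string.
    first_residue_number: full-length number of the fragment's first residue.
    mutation:             wild-type letter, residue number, replacement letter.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    wild_type, number, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = number - first_residue_number  # zero-based position in the fragment
    if not 0 <= index < len(fragment):
        raise ValueError(f"residue {number} lies outside the fragment")
    if fragment[index] != wild_type:
        raise ValueError(f"expected {wild_type} at {number}, found {fragment[index]}")
    return fragment[:index] + replacement + fragment[index + 1:]


# Hypothetical toy fragment covering residues 600-604; not a real GR sequence.
toy_fragment = "AAFAA"
print(apply_point_mutation(toy_fragment, 600, "F602S"))

# Index of residue 602 in fragments starting at residues 521 and 527.
print(602 - 521, 602 - 527)
```

Checking the wild-type letter before substituting guards against applying the mutation with the wrong starting offset.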

SEQ ID NO:32 is an amino acid sequence of a ligand binding domain
(residues 521-777) of a human glucocorticoid receptor containing a phenylalanine
to serine mutation at residue 602, wherein the first two residues comprise a
thrombin cleavage site encoded by the vector.

SEQ ID NO: 33 is an amino acid sequence of a ligand binding domain
(residues 521-777) of a human glucocorticoid receptor comprising a W557R
mutation.

SEQ ID NO: 34 is an amino acid sequence of a ligand binding domain
(residues 521-777) of a human glucocorticoid receptor comprising a Q615L
mutation.

SEQ ID NO: 35 is an amino acid sequence of a ligand binding domain
(residues 521-777) of a human glucocorticoid receptor comprising a Q615H
mutation.

SEQ ID NO: 36 is an amino acid sequence of a ligand binding domain
(residues 521-777) of a human glucocorticoid receptor comprising an A574T
mutation.

SEQ ID NO: 37 is an amino acid sequence of a ligand binding domain
(residues 521-777) of a human glucocorticoid receptor comprising an L620M
mutation.

SEQ ID NO: 38 is an amino acid sequence of a ligand binding domain
(residues 521-777) of a human glucocorticoid receptor comprising the double
mutation F602L/A580T.

SEQ ID NO: 39 is an amino acid sequence of a ligand binding domain
(residues 521-777) of a human glucocorticoid receptor comprising the double
mutation L563F/G583C.

SEQ ID NO: 40 is an amino acid sequence of a ligand binding domain
(residues 521-777) of a human glucocorticoid receptor comprising the double
mutation L664H/M752T.

SEQ ID NO: 41 is an amino acid sequence of a ligand binding domain
(residues 521-777) of a human glucocorticoid receptor comprising the double
mutation L563F/T744N.

Detailed Description of the Invention

The present invention provides for the generation of NR, SR and GR
polypeptides and NR, SR or GR mutants (preferably GRα and GRα LBD
mutants), and the ability to solve the crystal structures of those that crystallize.
Indeed, a GRα LBD having a point mutation was crystallized and solved in one
aspect of the present invention. Thus, an aspect of the present invention involves
the use of both targeted and random mutagenesis of the GR gene for the
production of a recombinant protein with improved solution characteristics for the
purpose of crystallization, characterization of biologically relevant protein-protein
interactions, and compound screening assays. The present invention, relating to
GR LBD F602S and other LBD mutations, shows that GR can be overexpressed
using an E. coli expression system and that active GR protein can be purified,
assayed, and crystallized.

Until disclosure of the present invention presented herein, the ability to
obtain crystalline forms of the ligand binding domain of GRα had not been
realized. And until disclosure of the present invention presented herein, a detailed
three-dimensional crystal structure of a GRα LBD polypeptide had not been
solved.

In addition to providing structural information, crystalline polypeptides
provide other advantages. For example, the crystallization process itself further
purifies the polypeptide, and satisfies one of the classical criteria for homogeneity.
In fact, crystallization frequently provides unparalleled purification quality,
removing impurities that are not removed by other purification methods such as
HPLC, dialysis, conventional column chromatography, and other methods.
Moreover, crystalline polypeptides are sometimes stable at ambient temperatures
and free of protease contamination and other degradation associated with solution
storage. Crystalline polypeptides can also be useful as pharmaceutical
preparations. Finally, crystallization techniques in general are largely free of
problems such as denaturation associated with other stabilization methods (e.g.,
lyophilization). Once crystallization has been accomplished, crystallographic data
provides useful structural information that can assist the design of compounds that
can serve as modulators (e.g., agonists or antagonists), as described herein
below. In addition, the crystal structure provides information useful to map a
receptor binding domain, which can then be mimicked by a chemical entity that
can serve as an antagonist or agonist.



I. Definitions

Following long-standing patent law convention, the terms "a" and "an" 
mean "one or more" when used in this application, including the claims. 

As used herein, the term "agonist" means an agent that supplements or
potentiates the bioactivity of a functional GR gene or protein or of a polypeptide
encoded by a gene that is up- or down-regulated by a GR polypeptide and/or a
polypeptide encoded by a gene that contains a GR binding site or response
element in its promoter region. By way of specific example, an "agonist" is a
compound that interacts with the steroid hormone receptor to promote a
transcriptional response. An agonist can induce changes in a receptor that place
the receptor in an active conformation that allows it to influence transcription,
either positively or negatively. There can be several different ligand-induced
changes in the receptor's conformation. The term "agonist" specifically
encompasses partial agonists.

As used herein, the terms "α-helix", "alpha-helix" and "alpha helix" are used
interchangeably and mean the conformation of a polypeptide chain wherein the
polypeptide backbone is wound around the long axis of the molecule in a
left-handed or right-handed direction, and the R groups of the amino acids protrude
outward from the helical backbone, wherein the repeating unit of the structure is a
single turn of the helix, which extends about 0.56 nm along the long axis.

As used herein, the term "antagonist" means an agent that decreases or
inhibits the bioactivity of a functional GR gene or protein, or that supplements or
potentiates the bioactivity of a naturally occurring or engineered non-functional GR
gene or protein. Alternatively, an antagonist can decrease or inhibit the bioactivity
of a functional gene or polypeptide encoded by a gene that is up- or
down-regulated by a GR polypeptide and/or contains a GR binding site or response
element in its promoter region. An antagonist can also supplement or potentiate
the bioactivity of a naturally occurring or engineered non-functional gene or
polypeptide encoded by a gene that is up- or down-regulated by a GR
polypeptide, and/or contains a GR binding site or response element in its
promoter region. By way of specific example, an "antagonist" is a compound that
interacts with the steroid hormone receptor to inhibit a transcriptional response.
An antagonist can bind to a receptor but fail to induce conformational changes
that alter the receptor's transcriptional regulatory properties or physiologically
relevant conformations. Binding of an antagonist can also block the binding and
therefore the actions of an agonist. The term "antagonist" specifically
encompasses partial antagonists.

As used herein, the terms "β-sheet", "beta-sheet" and "beta sheet" are used
interchangeably and mean the conformation of a polypeptide chain stretched into
an extended zig-zag conformation. Portions of polypeptide chains that run
"parallel" all run in the same direction. Polypeptide chains that are "antiparallel"
run in the opposite direction from the parallel chains.

As used herein, the terms "binding pocket of the GR ligand binding
domain", "GR ligand binding pocket" and "GR binding pocket" are used
interchangeably, and refer to the large cavity within the GR ligand binding domain
where a ligand can bind. This cavity can be empty, or can contain water
molecules or other molecules from the solvent, or can contain ligand atoms. The
main binding pocket is the region of space encompassed by the residues depicted
in Figure 7. The binding pocket also includes regions of space near the "main"
binding pocket that are not occupied by atoms of GR but that are near the "main"
binding pocket, and that are contiguous with the "main" binding pocket.

As used herein, the term "biological activity" means any observable effect
flowing from interaction between a GR polypeptide and a ligand. Representative,
but non-limiting, examples of biological activity in the context of the present
invention include transcription regulation, ligand binding and peptide binding.

As used herein, the terms "candidate substance" and "candidate
compound" are used interchangeably and refer to a substance that is believed to
interact with another moiety, for example a given ligand that is believed to interact
with a complete, or a fragment of, a GR polypeptide, and which can be
subsequently evaluated for such an interaction. Representative candidate
substances or compounds include xenobiotics such as drugs and other
therapeutic agents, carcinogens and environmental pollutants, natural products
and extracts, as well as endobiotics such as glucocorticosteroids, steroids, fatty
acids and prostaglandins. Other examples of candidate compounds that can be
investigated using the methods of the present invention include, but are not
restricted to, agonists and antagonists of a GR polypeptide, toxins and venoms,
viral epitopes, hormones (e.g., glucocorticosteroids, opioid peptides, steroids,
etc.), hormone receptors, peptides, enzymes, enzyme substrates, co-factors,
lectins, sugars, oligonucleotides or nucleic acids, oligosaccharides, proteins, small
molecules and monoclonal antibodies.

As used herein, the terms "cells," "host cells" or "recombinant host cells"
are used interchangeably and refer not only to the particular subject cell, but also
to the progeny or potential progeny of such a cell. Because certain modifications
can occur in succeeding generations due to either mutation or environmental
influences, such progeny might not, in fact, be identical to the parent cell, but are
still included within the scope of the term as used herein.

As used herein, the terms "chimeric protein" or "fusion protein" are used
interchangeably and mean a fusion of a first amino acid sequence encoding a GR
polypeptide with a second amino acid sequence defining a polypeptide domain
foreign to, and not homologous with, any domain of a GR polypeptide. A chimeric
protein can include a foreign domain that is found in an organism that also
expresses the first protein, or it can be an "interspecies" or "intergenic" fusion of
protein structures expressed by different kinds of organisms. In general, a fusion
protein can be represented by the general formula X-GR-Y, wherein GR
represents a portion of the protein which is derived from a GR polypeptide, and X
and Y are independently absent or represent amino acid sequences which are not
related to a GR sequence in an organism, which includes naturally occurring
mutants.

As used herein, the term "co-activator" means an entity that has the ability
to enhance transcription when it is bound to at least one other entity. The
association of a co-activator with an entity has the ultimate effect of enhancing the
transcription of one or more sequences of DNA. In the context of the present
invention, transcription is preferably nuclear receptor-mediated. By way of
specific example, in the present invention TIF2 (the human analog of mouse
glucocorticoid receptor interaction protein 1 (GRIP1)) can bind to a site on the
glucocorticoid receptor, an event that can enhance transcription. TIF2 is therefore a
co-activator of the glucocorticoid receptor. Other GR co-activators can include
SRC1.

As used herein, the term "co-repressor" means an entity that has the ability
to repress transcription when it is bound to at least one other entity. In the context
of the present invention, transcription is preferably nuclear receptor-mediated.
The association of a co-repressor with an entity has the ultimate effect of
repressing the transcription of one or more sequences of DNA.

As used herein, the term "crystal lattice" means the array of points defined 
by the vertices of packed unit cells. 

As used herein, the term "detecting" means confirming the presence of a
target entity by observing the occurrence of a detectable signal, such as a
radiologic or spectroscopic signal that will appear exclusively in the presence of
the target entity.

As used herein, the term "DNA segment" means a DNA molecule that has
been isolated free of total genomic DNA of a particular species. In a preferred
embodiment, a DNA segment encoding a GR polypeptide refers to a DNA
segment that comprises any of the odd numbered SEQ ID NOs:1-16, but can
optionally comprise fewer or additional nucleic acids, yet is isolated away from, or
purified free from, total genomic DNA of a source species, such as Homo sapiens.
Included within the term "DNA segment" are DNA segments and smaller
fragments of such segments, and also recombinant vectors, including, for
example, plasmids, cosmids, phages, viruses, and the like.

As used herein, the term "DNA sequence encoding a GR polypeptide" can
refer to one or more coding sequences within a particular individual. Moreover,
certain differences in nucleotide sequences can exist between individual
organisms, which are called alleles. It is possible that such allelic differences
might or might not result in differences in amino acid sequence of the encoded
polypeptide yet still encode a protein with the same biological activity. As is well
known, genes for a particular polypeptide can exist in single or multiple copies
within the genome of an individual. Such duplicate genes can be identical or can
have certain modifications, including nucleotide substitutions, additions or
deletions, all of which still code for polypeptides having substantially the same
activity.

As used herein, the phrase "enhancer-promoter" means a composite unit
that contains both enhancer and promoter elements. An enhancer-promoter is
operatively linked to a coding sequence that encodes at least one gene product.

As used herein, the term "expression" generally refers to the cellular 
processes by which a biologically active polypeptide is produced. 

As used herein, the term "gene" is used for simplicity to refer to a functional
protein, polypeptide or peptide encoding unit. As will be understood by those in
the art, this functional term includes both genomic sequences and cDNA
sequences. Preferred embodiments of genomic and cDNA sequences are
disclosed herein.

As used herein, the term "glucocorticoid" means a steroid hormone
glucocorticoid. "Glucocorticoids" are agonists for the glucocorticoid receptor.
Compounds which mimic glucocorticoids are also defined as glucocorticoid
receptor agonists. A preferred glucocorticoid receptor agonist is dexamethasone.
Other common glucocorticoid receptor agonists include cortisol, cortisone,
prednisolone, prednisone, methylprednisolone, triamcinolone, hydrocortisone, and
corticosterone. As used herein, glucocorticoid is intended to include, for example,
the following generic and brand name corticosteroids: cortisone (CORTONE
ACETATE, ADRESON, ALTESONA, CORTELAN, CORTISTAB, CORTISYL,
CORTOGEN, CORTONE, SCHEROSON); dexamethasone-oral (DECADRON-ORAL,
DEXAMETH, DEXONE, HEXADROL-ORAL, DEXAMETHASONE
INTENSOL, DEXONE 0.5, DEXONE 0.75, DEXONE 1.5, DEXONE 4);
hydrocortisone-oral (CORTEF, HYDROCORTONE); hydrocortisone cypionate
(CORTEF ORAL SUSPENSION); methylprednisolone-oral (MEDROL-ORAL);
prednisolone-oral (PRELONE, DELTA-CORTEF, PEDIAPRED, ADNISOLONE,
CORTALONE, DELTACORTRIL, DELTASOLONE, DELTASTAB, DI-ADRESON
F, ENCORTOLONE, HYDROCORTANCYL, MEDISOLONE, METICORTELONE,
OPREDSONE, PANAFCORTELONE, PRECORTISYL, PRENISOLONA,
SCHERISOLONA, SCHERISOLONE); prednisone (DELTASONE, LIQUID PRED,
METICORTEN, ORASONE 1, ORASONE 5, ORASONE 10, ORASONE 20,
ORASONE 50, PREDNICEN-M, PREDNISONE INTENSOL, STERAPRED,
STERAPRED DS, ADASONE, CARTANCYL, COLISONE, CORDROL, CORTAN,
DACORTIN, DECORTIN, DECORTISYL, DELCORTIN, DELLACORT,
DELTA-DOME, DELTACORTENE, DELTISONA, DIADRESON, ECONOSONE,
ENCORTON, FERNISONE, NISONA, NOVOPREDNISONE, PANAFCORT,
PANASOL, PARACORT, PARMENISON, PEHACORT, PREDELTIN,
PREDNICORT, PREDNICOT, PREDNIDIB, PREDNIMENT, RECTODELT,
ULTRACORTEN, WINPRED); triamcinolone-oral (KENACORT, ARISTOCORT,
ATOLONE, SHOLOG A, TRAMACORT-D, TRI-MED, TRIAMCOT, TRISTO-PLEX,
TRYLONE D, U-TRI-LONE).

As used herein, the term "glucocorticoid receptor," abbreviated herein as
"GR," means the receptor for a steroid hormone glucocorticoid. A glucocorticoid
receptor is a steroid receptor and, consequently, a nuclear receptor, since steroid
receptors are a subfamily of the superfamily of nuclear receptors. The term "GR"
means any polypeptide sequence that can be aligned with human GR such that at
least 70%, preferably at least 75%, of the amino acids are identical to the
corresponding amino acid in the human GR. The term "GR" also encompasses
nucleic acid sequences where the corresponding translated protein sequence can
be considered to be a GR. The term "GR" includes invertebrate homologs,
whether now known or hereafter identified; preferably, GR nucleic acids and
polypeptides are isolated from eukaryotic sources. "GR" further includes
vertebrate homologs of GR family members, including, but not limited to,
mammalian and avian homologs. Representative mammalian homologs of GR
family members include, but are not limited to, murine and human homologs.
"GR" specifically encompasses all GR isoforms, including GRα and GRβ. GRβ is
a splicing variant with 100% identity to GRα, except at the C-terminus, where 50
residues in GRα have been replaced with 15 residues in GRβ.
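
By way of non-limiting illustration, the 70% identity criterion in the foregoing definition can be evaluated on a pairwise alignment as sketched below. The sketch assumes that the two sequences are already aligned to equal length with "-" marking gaps, and that identity is counted over the columns in which the human GR reference has a residue; the definition above does not fix the denominator, so that choice is an assumption. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_query, aligned_reference, gap="-"):
    """Percent of reference residues matched by the query in a pairwise alignment.

    Both strings must have equal length. Columns where the reference has a
    gap are skipped, so the denominator is the number of reference residues.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for query_res, ref_res in zip(aligned_query, aligned_reference):
        if ref_res == gap:
            continue  # no reference residue in this column
        compared += 1
        if query_res == ref_res:
            matches += 1
    if compared == 0:
        raise ValueError("reference contains no residues")
    return 100.0 * matches / compared


# Hypothetical toy alignment; these strings are not GR sequences.
toy_reference = "ACDEFGHIKL"
toy_query = "ACDEFGHIQQ"

identity = percent_identity(toy_query, toy_reference)
print(identity, identity >= 70.0)
```

Other conventions, such as dividing by the alignment length or by the shorter sequence length, give different percentages for the same alignment.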
As used herein, the terms "GR gene product", "GR protein", "GR
polypeptide", and "GR peptide" are used interchangeably and mean peptides 
having amino acid sequences which are substantially identical to native amino 
acid sequences from the organism of interest and which are biologically active in 
that they comprise all or a part of the amino acid sequence of a GR polypeptide, 
or cross-react with antibodies raised against a GR polypeptide, or retain all or 
some of the biological activity (e.g., DNA or ligand binding ability and/or 
transcriptional regulation) of the native amino acid sequence or protein. Such 
biological activity can include immunogenicity. Representative embodiments are 
set forth in any even numbered SEQ ID NOs:2-16. The terms "GR gene product", 
"GR protein", "GR polypeptide", and "GR peptide" also include analogs of a GR 
polypeptide. By "analog" is intended that a DNA or peptide sequence can contain 
alterations relative to the sequences disclosed herein, yet retain all or some of the 
biological activity of those sequences. Analogs can be derived from genomic 
nucleotide sequences as are disclosed herein or from other organisms, or can be 
created synthetically. Those skilled in the art will appreciate that other analogs, as 
yet undisclosed or undiscovered, can be used to design and/or construct GR 
analogs. There is no need for a "GR gene product", "GR protein", "GR 
polypeptide", or "GR peptide" to comprise all or substantially all of the amino acid 
sequence of a GR polypeptide gene product. Shorter or longer sequences are 
anticipated to be of use in the invention; shorter sequences are herein referred to 
as "segments". Thus, the terms "GR gene product", "GR protein", "GR 
polypeptide", and "GR peptide" also include fusion or recombinant GR 
polypeptides and proteins comprising sequences of the present invention. 
Methods of preparing such proteins are disclosed herein and are known in the art. 

As used herein, the terms "GR gene" and "recombinant GR gene" mean a 
nucleic acid molecule comprising an open reading frame encoding a GR 
polypeptide of the present invention, including both exon and (optionally) intron 
sequences. 

As used herein, "hexagonal unit cell" means a unit cell wherein a = b ≠ c;
and α = β = 90°, γ = 120°. The vectors a, b and c describe the unit cell edges and
the angles α, β, and γ describe the unit cell angles. In a preferred embodiment of
the present invention, the unit cell has lattice constants of a = b = 126.014 Å, c =
86.312 Å, α = 90°, β = 90°, γ = 120°. While preferred lattice constants are
provided, a crystalline polypeptide of the present invention also comprises
variations from the preferred lattice constants, wherein the variations range from
about one to about two percent. Thus, for example, a crystalline polypeptide of
the present invention can also comprise lattice constants of about 125 Å or about
127 Å.
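
By way of non-limiting illustration, the hexagonal conditions and the tolerance on the lattice constants can be checked numerically as sketched below. The sketch uses the general unit cell volume formula, which for α = β = 90° and γ = 120° reduces to a²c·sin(120°), giving a volume of about 1.19 × 10^6 cubic angstroms for the preferred constants. The function names and the treatment of the two percent figure as a fractional bound are assumptions made for illustration.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume from edge lengths (angstroms) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def is_hexagonal(a, b, c, alpha, beta, gamma, tol=1e-6):
    """True when a = b, a differs from c, alpha = beta = 90 and gamma = 120."""
    return (
        math.isclose(a, b, abs_tol=tol)
        and not math.isclose(a, c, abs_tol=tol)
        and math.isclose(alpha, 90.0, abs_tol=tol)
        and math.isclose(beta, 90.0, abs_tol=tol)
        and math.isclose(gamma, 120.0, abs_tol=tol)
    )


def within_fraction(value, preferred, fraction=0.02):
    """True when value deviates from preferred by at most the given fraction."""
    return abs(value - preferred) <= fraction * preferred


# Preferred lattice constants stated in the definition above.
preferred = (126.014, 126.014, 86.312, 90.0, 90.0, 120.0)

print(is_hexagonal(*preferred))
print(cell_volume(*preferred))
print(within_fraction(125.0, 126.014), within_fraction(127.0, 126.014))
```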

As used herein, the term "hybridization" means the binding of a probe 
molecule, a molecule to which a detectable moiety has been bound, to a target 
sample. 

As used herein, the term "interact" means detectable interactions between
molecules, such as can be detected using, for example, a yeast two hybrid assay.
The term "interact" is also meant to include "binding" interactions between
molecules. Interactions can, for example, be protein-protein or protein-nucleic
acid in nature.

As used herein, the term "intron" means a DNA sequence present in a
given gene that is not translated into protein.

As used herein, the term "isolated" means oligonucleotides substantially
free of other nucleic acids, proteins, lipids, carbohydrates or other materials with
which they can be associated, such association being either in cellular material or
in a synthesis medium. The term can also be applied to polypeptides, in which
case the polypeptide will be substantially free of nucleic acids, carbohydrates,
lipids and other undesired polypeptides.

As used herein, the term "labeled" means the attachment of a moiety,
capable of detection by spectroscopic, radiologic or other methods, to a probe
molecule.

As used herein, the term "modified" means an alteration from an entity's 
normally occurring state. An entity can be modified by removing discrete chemical 
units or by adding discrete chemical units. The term "modified" encompasses 
detectable labels as well as those entities added as aids in purification. 

As used herein, the term "modulate" means an increase, decrease, or other
alteration of any or all chemical and biological activities or properties of a wild-type
or mutant GR polypeptide, preferably a wild-type or mutant GR polypeptide. The
term "modulation" as used herein refers to both upregulation (i.e., activation or
stimulation) and downregulation (i.e., inhibition or suppression) of a response, and
includes responses that are upregulated in one cell type or tissue, and
downregulated in another cell type or tissue.

As used herein, the term "molecular replacement" means a method that 
involves generating a preliminary model of the wild-type GR ligand binding 
domain, or a GR mutant crystal whose structure coordinates are unknown, by 
orienting and positioning a molecule or model whose structure coordinates are 
known (e.g., a nuclear receptor) within the unit cell of the unknown crystal so as 
best to account for the observed diffraction pattern of the unknown crystal. 
Phases can then be calculated from this model and combined with the observed 
amplitudes to give an approximate Fourier synthesis of the structure whose 
coordinates are unknown. This, in turn, can be subject to any of the several forms 
of refinement to provide a final, accurate structure of the unknown crystal. See, 
e.g., Lattman, (1985) Methods Enzymol., 115: 55-77; Rossmann, ed., (1972) The 
Molecular Replacement Method, Gordon & Breach, New York. Using the structure 
coordinates of the ligand binding domain of GR provided by this invention, 
molecular replacement can be used to determine the structure coordinates of a 
crystalline mutant or homologue of the GR ligand binding domain, or of a different 
crystal form of the GR ligand binding domain. 
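
The step in which phases calculated from the positioned model are combined with the observed amplitudes can be illustrated with a deliberately simplified, one-dimensional sketch. The function names and the toy amplitude and phase values are assumptions made for illustration only; they are not part of the disclosed method and do not stand in for any crystallographic software package.

```python
import cmath
import math


def combine_amplitudes_and_phases(f_obs, phi_calc_deg):
    """Pair each observed amplitude |F_obs| with a model-derived phase (degrees).

    Returns complex Fourier coefficients |F_obs| * exp(i * phi_calc).
    """
    return [
        cmath.rect(amplitude, math.radians(phase))
        for amplitude, phase in zip(f_obs, phi_calc_deg)
    ]


def density_1d(coefficients, x):
    """Unnormalized one-dimensional Fourier synthesis at fractional coordinate x.

    coefficients[h] is the coefficient for index h = 0, 1, 2, ...
    The Friedel mate of each h > 0 term is accounted for by the factor of 2.
    """
    total = coefficients[0].real
    for h, coeff in enumerate(coefficients[1:], start=1):
        total += 2.0 * (coeff * cmath.exp(-2j * math.pi * h * x)).real
    return total


# Hypothetical toy data: two amplitudes with model phases of 0 degrees.
coeffs = combine_amplitudes_and_phases([10.0, 5.0], [0.0, 0.0])
print(density_1d(coeffs, 0.0), density_1d(coeffs, 0.5))
```

In a real structure determination the synthesis is three-dimensional and is followed by refinement; the sketch shows only how the same amplitudes give a different map when the model phases change.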

As used herein, the term "mutation" carries its traditional connotation and 
means a change, inherited, naturally occurring or introduced, in a nucleic acid or 
polypeptide sequence, and is used in its sense as generally known to those of skill 
in the art. 

As used herein, the term "nuclear receptor", occasionally abbreviated 
herein as "NR", means a member of the superfamily of receptors that comprises 
at least the subfamilies of steroid receptors, thyroid hormone receptors, retinoic 
acid receptors and vitamin D receptors. Thus, a given nuclear receptor can be 
further classified as a member of a subfamily while retaining its status as a 
nuclear receptor. 

As used herein, the phrase "operatively linked" means that an enhancer- 
promoter is connected to a coding sequence in such a way that the transcription 
of that coding sequence is controlled and regulated by that enhancer-promoter. 
Techniques for operatively linking an enhancer-promoter to a coding sequence 
are well known in the art; the precise orientation and location relative to a coding 
sequence of interest is dependent, inter alia, upon the specific nature of the 
enhancer-promoter. 

As used herein, the term "partial agonist" means an entity that can bind to a 
receptor and induce only part of the changes in the receptors that are induced by 
agonists. The differences can be qualitative or quantitative. Thus, a partial 
agonist can induce some of the conformation changes induced by agonists, but 
not others, or it can only induce certain changes to a limited extent. 

As used herein, the term "partial antagonist" means an entity that can bind 
to a receptor and inhibit only part of the changes in the receptors that are induced 
by antagonists. The differences can be qualitative or quantitative. Thus, a partial 
antagonist can inhibit some of the conformation changes induced by an 
antagonist, but not others, or it can inhibit certain changes to a limited extent. 

As used herein, the term "polypeptide" means any polymer comprising any 
of the 20 protein amino acids, regardless of its size. Although "protein" is often 
used in reference to relatively large polypeptides, and "peptide" is often used in 
reference to small polypeptides, usage of these terms in the art overlaps and 
varies. The term "polypeptide" as used herein refers to peptides, polypeptides 
and proteins, unless otherwise noted. The terms "protein", 
"polypeptide" and "peptide" are used interchangeably herein when referring to a 
gene product. 

As used herein, the term "primer" means a sequence comprising two or 
more deoxyribonucleotides or ribonucleotides, preferably more than three, and 
more preferably more than eight and most preferably at least about 20 nucleotides 
of an exonic or intronic region. Such oligonucleotides are preferably between ten 
and thirty bases in length. 

As used herein, the term "sequencing" means determining the ordered 
linear sequence of nucleic acids or amino acids of a DNA or protein target sample, 
using conventional manual or automated laboratory techniques. 

As used herein, the term "space group" means the arrangement of 
symmetry elements of a crystal. 

As used herein, the term "steroid receptor" means a nuclear receptor that 
can bind or associate with a steroid compound. Steroid receptors are a subfamily 
of the superfamily of nuclear receptors. The subfamily of steroid receptors 
comprises glucocorticoid receptors and, therefore, a glucocorticoid receptor is a 
member of the subfamily of steroid receptors and the superfamily of nuclear 
receptors. 

As used herein, the terms "structure coordinates" and "structural 
coordinates" mean mathematical coordinates derived from mathematical 
equations related to the patterns obtained on diffraction of a monochromatic beam 
of X-rays by the atoms (scattering centers) of a molecule in crystal form. The 
diffraction data are used to calculate an electron density map of the repeating unit 
of the crystal. The electron density maps are used to establish the positions of the 
individual atoms within the unit cell of the crystal. 

Those of skill in the art understand that a set of coordinates determined by 
X-ray crystallography is not without standard error. In general, the error in the 
coordinates tends to be reduced as the resolution is increased, since more 
experimental diffraction data are available for the model fitting and refinement. 
Thus, for example, more diffraction data can be collected from a crystal that 
diffracts to a resolution of 2.8 angstroms than from a crystal that diffracts to a 
lower resolution, such as 3.5 angstroms. Consequently, the refined structural 
coordinates will usually be more accurate when fitted and refined using data from 
a crystal that diffracts to higher resolution. The design of ligands and modulators 
for GR or any other NR depends on the accuracy of the structural coordinates. If 
the coordinates are not sufficiently accurate, then the design process will be 
ineffective. In most cases, it is very difficult or impossible to collect sufficient 
diffraction data to define atomic coordinates precisely when the crystals diffract to 
a resolution of only 3.5 angstroms or poorer. Thus, in most cases, it is difficult to 
use X-ray structures in structure-based ligand design when the X-ray structures 
are based on crystals that diffract to a resolution of only 3.5 angstroms or 
poorer. However, common experience has shown that crystals diffracting to 2.8 
angstroms or better can yield X-ray structures with sufficient accuracy to greatly 
facilitate structure-based drug design. Further improvement in the resolution can 
further facilitate structure-based design, but the coordinates obtained at 2.8 
angstroms resolution are generally adequate for most purposes. 
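
The dependence of the amount of diffraction data on resolution can be made concrete with a rough estimate. The sketch below assumes the standard approximation that the number of reciprocal-lattice points inside the resolution sphere scales as V/d^3 for a primitive cell; the function names and the example cell volume are hypothetical, and the count ignores symmetry-related and systematically absent reflections.

```python
import math


def approx_reflection_count(cell_volume, d_min):
    """Approximate number of reciprocal-lattice points with spacing >= d_min.

    Uses N ~ (4 * pi / 3) * V / d_min**3 for a primitive unit cell of volume V
    (in cubic angstroms). Symmetry equivalents are not merged.
    """
    return (4.0 * math.pi / 3.0) * cell_volume / d_min ** 3


def data_ratio(d_high, d_low):
    """Ratio of available reflections at resolution d_high versus d_low."""
    return (d_low / d_high) ** 3


# Compare the two resolutions discussed above (values in angstroms).
print(round(data_ratio(2.8, 3.5), 2))
```

Under this approximation a crystal diffracting to 2.8 angstroms provides roughly twice as many reflections as one diffracting to 3.5 angstroms, which is consistent with the better-determined coordinates described above.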

Also, those of skill in the art will understand that NR proteins can adopt 
different conformations when different ligands are bound. In particular, NR 
proteins will adopt substantially different conformations when agonists and 
antagonists are bound. Subtle variations in the conformation can also occur when 
different agonists are bound, and when different antagonists are bound. These 
variations can be difficult or impossible to predict from a single X-ray structure. 
Generally, structure-based design of GR modulators depends to some degree on 
a knowledge of the differences in conformation that occur when agonists and 
antagonists are bound. Thus, structure-based modulator design is most facilitated 
by the availability of X-ray structures of complexes with potent agonists as well as 
potent antagonists. 

As used herein, the term "substantially pure" means that the polynucleotide 
or polypeptide is substantially free of the sequences and molecules with which it is 
associated in its natural state, and those molecules used in the isolation 
procedure. The term "substantially free" means that the sample is at least 50%, 
preferably at least 70%, more preferably 80% and most preferably 90% free of the 
materials and compounds with which it is associated in nature. 

As used herein, the term "target cell" refers to a cell, into which it is desired 
to insert a nucleic acid sequence or polypeptide, or to otherwise effect a 
modification from conditions known to be standard in the unmodified cell. A 
nucleic acid sequence introduced into a target cell can be of variable length. 
Additionally, a nucleic acid sequence can enter a target cell as a component of a 
plasmid or other vector or as a naked sequence. 

As used herein, the term "transcription" means a cellular process involving 
the interaction of an RNA polymerase with a gene that directs the expression as 
RNA of the structural information present in the coding sequences of the gene. 
The process includes, but is not limited to, the following steps: (a) transcription 
initiation, (b) transcript elongation, (c) transcript splicing, (d) transcript capping, (e) 
transcript termination, (f) transcript polyadenylation, (g) nuclear export of the 
transcript, (h) transcript editing, and (i) stabilizing the transcript. 

As used herein, the term "transcription factor" means a cytoplasmic or 
nuclear protein which binds to such gene, or binds to an RNA transcript of such 
gene, or binds to another protein which binds to such gene or such RNA transcript 
or another protein which in turn binds to such gene or such RNA transcript, so as 
to thereby modulate expression of the gene. Such modulation can additionally be 
achieved by other mechanisms; the essence of "transcription factor for a gene" is 
that the level of transcription of the gene is altered in some way. 

As used herein, the term "unit cell" means a basic parallelepiped shaped 
block. The entire volume of a crystal can be constructed by regular assembly of 
such blocks. Each unit cell comprises a complete representation of the unit of 
pattern, the repetition of which builds up the crystal. Thus, the term "unit cell" 
means the fundamental portion of a crystal structure that is repeated infinitely by 
translation in three dimensions. A unit cell is characterized by three vectors a, b, 
and c, not located in one plane, which form the edges of a parallelepiped. Angles 
α, β and γ define the angles between the vectors: angle α is the angle between 
vectors b and c; angle β is the angle between vectors a and c; and angle γ is the 
angle between vectors a and b. The entire volume of a crystal can be constructed 
by regular assembly of unit cells; each unit cell comprises a complete 
representation of the unit of pattern, the repetition of which builds up the crystal. 
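
As an illustration of how the three edge lengths and three angles define the parallelepiped, the following sketch computes a unit cell volume from the general triclinic formula. The function name and the example cell dimensions are hypothetical and are not the lattice constants of the disclosed crystal.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume of a unit cell from edge lengths a, b, c and angles in degrees.

    alpha is the angle between b and c, beta between a and c, and gamma
    between a and b. General formula:
    V = a*b*c*sqrt(1 - cos^2(alpha) - cos^2(beta) - cos^2(gamma)
                   + 2*cos(alpha)*cos(beta)*cos(gamma))
    """
    ca = math.cos(math.radians(alpha_deg))
    cb = math.cos(math.radians(beta_deg))
    cg = math.cos(math.radians(gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)


# Hypothetical hexagonal-type cell: alpha = beta = 90 degrees, gamma = 120 degrees.
print(unit_cell_volume(10.0, 10.0, 20.0, 90.0, 90.0, 120.0))
```

For a cell with α = β = 90° and γ = 120°, the formula reduces to a·b·c·sin(120°), which provides a quick check on the general expression.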

II. Description of Tables 

Table 1 is a chart of sequence identity between the ligand binding domains of 
several nuclear receptors. 

Table 2 is a table listing mutations of the GR LBD (521-777) gene for 
testing solution solubility and stability. SEQ ID NOs:7-8 and 15-16 also comprise 
these mutations. Candidate mutated residues include but are not limited to Cys, 
Asn, Tyr, Lys, Ser, Asp, Glu, Gln, Arg or Thr. 

Table 2A is a table listing mutations that were discovered using the 
LacI-based "peptides-on-plasmids" technique with GR LBD. 

Table 3 is a table summarizing the crystal and data statistics obtained from 
the crystallized ligand binding domain of GRα that was co-crystallized with 
dexamethasone and a fragment of the co-activator TIF2. Data on the unit cell are 
presented, including data on the crystal space group, unit cell dimensions, 
molecules per asymmetric cell and crystal resolution. 

Table 4 is a table of the atomic structure coordinate data obtained from X-ray 
diffraction from the ligand binding domain of GR (residues 521-777) in 
complex with dexamethasone and a fragment of the co-activator TIF2. 

Table 5 is a table of the atomic structure coordinates used as the initial 
model to solve the structure of the GR/TIF2/dexamethasone complex by 
molecular replacement. The GR model is a homology model built on the 
published structure of the progesterone receptor LBD and the SRC1 coactivator 
peptide from the PPARα/Compound 1/SRC1 structure. 

III. General Considerations 

The present invention will usually be applicable mutatis mutandis to nuclear 
receptors in general, more particularly to steroid receptors and even more 
particularly to glucocorticoid receptors, including GR isoforms, as discussed 
herein, based, in part, on the patterns of nuclear receptor and steroid receptor 
structure and modulation that have emerged as a consequence of the present 
disclosure, which in part discloses determining the three dimensional structure of 
the ligand binding domain of GRα in complex with dexamethasone and a 
fragment of the co-activator TIF2. 

The nuclear receptor superfamily has been subdivided into two subfamilies: 
the GR subfamily (also referred to as the steroid receptors and denoted SRs), 
comprising GR, AR (androgen receptor), MR (mineralocorticoid receptor) and PR 
(progesterone receptor), and the thyroid hormone receptor (TR) subfamily, 
comprising TR, vitamin D receptor (VDR), retinoic acid receptor (RAR), retinoid X 
receptor (RXR), and most orphan receptors. This division has been made on the 
basis of DNA binding domain structures, interactions with heat shock proteins 
(HSP), and ability to form dimers. 

Steroid receptors (SRs) form a subset of the superfamily of nuclear 
receptors. The glucocorticoid receptor is a steroid receptor and thus a member of 
the superfamily of nuclear receptors and the subset of steroid receptors. The 
human glucocorticoid receptor exists in two isoforms, GRα, which consists of 777 
amino acids, and GRβ, which consists of 742 amino acids. As noted, the alpha 
isoform of human glucocorticoid receptor is made up of 777 amino acids and is 
predominantly cytoplasmic in its unactivated, non-DNA binding form. When 
activated, it translocates to the nucleus. In order to understand the role played by 
the glucocorticoid receptor in the different cell processes, the receptor was 
mapped by transfecting receptor-negative and glucocorticoid-resistant cells with 
different steroid receptor constructs and reporter genes like chloramphenicol 
acetyltransferase (CAT) or luciferase which had been covalently linked to a 
glucocorticoid responsive element (GRE). From these and other studies, four 
major functional domains have become evident. 

From amino to carboxyl terminal end, these functional domains include the 
tau 1, DNA binding, and ligand binding domains in succession. The tau 1 domain 
spans amino acid positions 77-262 and regulates gene activation. The DNA 
binding domain is from amino acid positions 421-486 and has nine cysteine 
residues, eight of which are organized in the form of two zinc fingers analogous to 
Xenopus transcription factor IIIA. The DNA binding domain binds to the regulatory 
sequences of genes that are induced or deinduced by glucocorticoids. Amino 
acids 521 to 777 form the ligand binding domain, which binds glucocorticoid to 
activate the receptor. This region of the receptor also has the nuclear localization 
signal. Deletion of this carboxyl terminal end results in a receptor that is 
constitutively active for gene induction (up to 30% of wild type activity) and even 
more active for cell kill (up to 150% of wild type activity) (Giguere et al., (1986) 
Cell 46: 645-652; Hollenberg et al., (1987) Cell 49: 39-46; Hollenberg & Evans, 
(1988) Cell 55: 899-906; Hollenberg et al., (1989) Cancer Res. 49: 2292s-2294s; 
Oro et al., (1988) Cell 55: 1109-1114; Evans, (1989) in Recent Progress in 
Hormone Research (Clark, ed.) Vol. 45, pp. 1-27, Academic Press, San Diego, 
California; Green & Chambon, (1987) Nature 325: 75-78; Picard & Yamamoto, 
(1987) EMBO J. 6: 3333-3340; Picard et al., (1990) Cell Regul. 1: 291-299; 
Godowski et al., (1987) Nature 325: 365-368; Miesfeld et al., (1987) Science 
236: 423-427; Danielsen et al., (1989) Cancer Res. 49: 2286s-2291s; Danielsen et 
al., (1987) Molec. Endocrinol. 1: 816-822; Umesono & Evans, (1989) Cell 57: 
1139-1146). Despite the aforementioned indirect characterization of the structure 
of GRα, until the present disclosure, a detailed three-dimensional model of the 
ligand binding domain of GRα has not been achieved. 
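
The residue ranges given above for the functional domains of the alpha isoform can be collected in a small lookup, sketched below. The dictionary and function names are hypothetical conveniences; only the residue boundaries (77-262, 421-486 and 521-777) are taken from the text, and residues outside those ranges are simply reported as unassigned.

```python
# Residue ranges (inclusive) as stated in the text for human GR alpha.
GR_ALPHA_DOMAINS = {
    "tau 1": (77, 262),
    "DNA binding": (421, 486),
    "ligand binding": (521, 777),
}


def domain_of(residue_number):
    """Return the name of the domain containing the residue, or None."""
    for name, (start, end) in GR_ALPHA_DOMAINS.items():
        if start <= residue_number <= end:
            return name
    return None


# Example: a residue inside the 521-777 ligand binding domain range.
print(domain_of(753))
```

A lookup of this kind is useful mainly for bookkeeping, for example when checking that a proposed mutation site falls within the ligand binding domain construct.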

GR subgroup members are tightly bound by heat shock protein(s) (HSP) in 
the absence of ligand, dimerize following ligand binding and dissociation of HSP, 
and show homology in the DNA half sites to which they bind. These half sites 
also tend to be arranged as palindromes. TR subgroup members tend to be 
bound to DNA or other chromatin molecules when unliganded, can bind to DNA 
as monomers and dimers, but tend to form heterodimers, and bind DNA elements 
with a variety of orientations and spacings of the half sites, and also show 
homology with respect to the nucleotide sequences of the half sites. ER does not 
belong to either subfamily, since it resembles the GR subfamily in HSP 
interactions, and the TR subfamily in nuclear localization and DNA-binding 
properties. 

Most members of the superfamily, including orphan receptors, possess at 
least two transcription activation subdomains, one of which is constitutive and 
resides in the amino terminal domain (AF-1), and the other of which (AF-2) 
resides in the ligand binding domain, whose activity is regulated by binding of an 
agonist ligand. The function of AF-2 requires an activation domain (also called 
transactivation domain) that is highly conserved among the receptor superfamily. 
Most LBDs contain an activation domain. Some mutations in this domain abolish 
AF-2 function, but leave ligand binding and other functions unaffected. Ligand 
binding allows the activation domain to serve as an interaction site for essential 
co-activator proteins that function to stimulate (or in some cases, inhibit) 
transcription. 

Analysis and alignment of amino acid sequences, and X-ray and NMR 
structure determinations, have shown that nuclear receptors have a modular 
architecture with three main domains: 

1) a variable amino-terminal domain; 

2) a highly conserved DNA-binding domain (DBD); and 

3) a less conserved carboxy-terminal ligand binding domain (LBD). 

In addition, nuclear receptors can have linker segments of variable length 
between these major domains. Sequence analysis and X-ray crystallography, 
including the disclosure of the present invention, have confirmed that GR also has 
the same general modular architecture, with the same three domains. The 
function of GR in human cells presumably requires all three domains in a single 
amino acid sequence. However, the modularity of GR permits different domains 
of each protein to separately accomplish certain functions. Some of the functions 
of a domain within the full-length receptor are preserved when that particular 
domain is isolated from the remainder of the protein. Using conventional protein 
chemistry techniques, a modular domain can sometimes be separated from the 
parent protein. Using conventional molecular biology techniques, each domain 
can usually be separately expressed with its original function intact or, as 
discussed herein below, chimeras comprising two different proteins can be 
constructed, wherein the chimeras retain the properties of the individual functional 
domains of the respective nuclear receptors from which the chimeras were 
generated. 

The carboxy-terminal activation subdomain is in close three dimensional 
proximity in the LBD to the ligand, so as to allow for ligands bound to the LBD to 
coordinate (or interact) with amino acid(s) in the activation subdomain. As 
described herein, the LBD of a nuclear receptor can be expressed, crystallized, its 
three dimensional structure determined with a ligand bound (either using crystal 
data from the same receptor or a different receptor or a combination thereof), and 
computational methods used to design ligands to its LBD, particularly ligands that 
contain an extension moiety that coordinates the activation domain of the nuclear 
receptor. 

The LBD is the second most highly conserved domain in these receptors. 
As its name suggests, the LBD binds ligands. With many nuclear receptors, 
including GR, binding of the ligand can induce a conformational change in the 
LBD that can, in turn, activate transcription of certain target genes. Whereas 
integrity of several different LBD sub-domains is important for ligand binding, 
truncated molecules containing only the LBD retain normal ligand-binding activity. 
This domain also participates in other functions, including dimerization, nuclear 
translocation and transcriptional activation, as described herein. 

Nuclear receptors usually have HSP binding domains that present a region 
for binding to the LBD and can be modulated by the binding of a ligand to the 
LBD. For many of the nuclear receptors, ligand binding induces a dissociation of 
heat shock proteins such that the receptors can form dimers in most cases, after 
which the receptors bind to DNA and regulate transcription. Consequently, a 
ligand that stabilizes the binding or contact of the heat shock protein binding 
domain with the LBD can be designed using the computational methods described 
herein. 

With the receptors that are associated with the HSP in the absence of the 
ligand, dissociation of the HSP results in dimerization of the receptors. 
Dimerization is due to receptor domains in both the DBD and the LBD. Although 
the main stimulus for dimerization is dissociation of the HSP, the ligand-induced 
conformational changes in the receptors can have an additional facilitative 
influence. With the receptors that are not associated with HSP in the absence of 
the ligand, particularly with the TR, ligand binding can affect the pattern of 
dimerization. The influence depends on the DNA binding site context, and can 
also depend on the promoter context with respect to other proteins that can 
interact with the receptors. A common pattern is to discourage monomer 
formation, with a resulting preference for heterodimer formation over dimer 
formation on DNA. 

Nuclear receptor LBDs usually have dimerization domains that present a 
region for binding to another nuclear receptor and can be modulated by the 
binding of a ligand to the LBD. Consequently, a ligand that disrupts the binding or 
contact of the dimerization domain can be designed using the computational 
methods described herein to produce a partial agonist or antagonist. 

The amino terminal domain of GR is the least conserved of the three 
domains. This domain is involved in transcriptional activation, and its uniqueness 
might dictate selective receptor-DNA binding and activation of target genes by GR 
subtypes. This domain can display synergistic and antagonistic interactions with 
the domains of the LBD. 

The DNA binding domain has the most highly conserved amino acid 
sequence amongst the GRs. It typically comprises about 70 amino acids that fold 
into two zinc finger motifs, wherein a zinc atom coordinates four cysteines. The 
DBD comprises two perpendicularly oriented α-helices that extend from the base 
of the first and second zinc fingers. The two zinc fingers function in concert along 
with non-zinc finger residues to direct the GR to specific target sites on DNA and 
to align receptor dimer interfaces. Various amino acids in the DBD influence 
spacing between two half-sites (which usually comprises six nucleotides) for 
receptor dimerization. The optimal spacings facilitate cooperative interactions 
between DBDs, and D box residues are part of the dimerization interface. Other 
regions of the DBD that facilitate DNA-protein and protein-protein interactions are 
involved in dimerization. 

In nuclear receptors that bind to a HSP, the ligand-induced dissociation of 
HSP with consequent dimer formation allows, and therefore promotes, DNA 
binding. With receptors that are not associated (as in the absence of ligand), 
ligand binding tends to stimulate DNA binding of heterodimers and dimers, and to 
discourage monomer binding to DNA. However, with DNA containing only a 
single half site, the ligand tends to stimulate the receptor's binding to DNA. The 
effects are modest and depend on the nature of the DNA site and probably on the 
presence of other proteins that can interact with the receptors. Nuclear receptors 
usually have DBD (DNA binding domains) that present a region for binding to 
DNA and this binding can be modulated by the binding of a ligand to the LBD. 

The modularity of the members of the nuclear receptor superfamily permits 
different domains of each protein to separately accomplish different functions, 
although the domains can influence each other. The separate function of a 
domain is usually preserved when a particular domain is isolated from the 
remainder of the protein. Using conventional protein chemistry techniques, a 
modular domain can sometimes be separated from the parent protein. By 
employing conventional molecular biology techniques each domain can usually be 
separately expressed with its original function intact or chimerics of two different 
nuclear receptors can be constructed, wherein the chimerics retain the properties 
of the individual functional domains of the respective nuclear receptors from which 
the chimerics were generated. 

Various structures have indicated that most nuclear receptor LBDs adopt 
the same general folding pattern. This fold consists of 10-12 alpha helices 
arranged in a bundle, together with several beta-strands, and linking segments. A 
preferred GRα LBD structure of the present invention has 10-11 helices, 
depending on whether helix-3' is counted. Structural studies have shown that 
most of the alpha-helices and beta-strands have the same general position and 
orientation in all nuclear receptor structures, whether ligand is bound or not. 
However, the AF2 helix has been found in different positions and orientations 
relative to the main bundle, depending on the presence or absence of the ligand, 
and also on the chemical nature of the ligand. These structural studies have 
suggested that many nuclear receptors share a common mechanism of activation, 
where binding of activating ligands helps to stabilize the AF2 helix in a position 
and orientation adjacent to helices-3, -4, and -10, covering an opening to the 
ligand binding site. This position and orientation of the AF2 helix, which will be 
called the "active conformation", creates a binding site for co-activators. See, e.g., 
Nolte et al., (1998) Nature 395: 137-43; Shiau et al., (1998) Cell 95: 927-37. This 
co-activator binding site has a central lipophilic pocket that can accommodate 
leucine side-chains from co-activators, as well as a "charge-clamp" structure 
consisting essentially of a lysine residue from helix-3 and a glutamic acid residue 
from the AF2 helix. 

Structural studies have shown that co-activator peptides containing the 
sequence LXXLL (where L is leucine and X can be a different amino acid in 
different cases) can bind to this co-activator binding site by making interactions 
with the charge clamp lysine and glutamic acid residues, as well as the central 
lipophilic region. This co-activator binding site is disrupted when the AF2 helix is 
shifted into other positions and orientations. In PPARγ, activating ligands such as 
rosiglitazone (BRL49653) make a hydrogen bonding interaction with tyrosine-473 
in the AF2 helix. Nolte et al., (1998) Nature 395: 137-43; Gampe et al., (2000) 
Mol. Cell 5: 545-55. Similarly, in GR, the dexamethasone ligand makes a van der 
Waals interaction with the side chain of leucine-753 from the AF2 helix. This 
interaction is believed in part to stabilize the AF2 helix in the active conformation, 
thereby allowing co-activators to bind and thus activating transcription from target 
genes. 
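
The LXXLL motif described above lends itself to a simple sequence search, sketched here. The function name and the test sequence are hypothetical and chosen only for illustration; the sequence is not taken from any co-activator disclosed herein. X is treated as any single uppercase one-letter residue code.

```python
import re

# Lookahead so that overlapping occurrences of the motif are all reported.
LXXLL_PATTERN = re.compile(r"(?=(L[A-Z]{2}LL))")


def find_lxxll(sequence):
    """Return (1-based start position, matched pentapeptide) for each LXXLL."""
    return [
        (match.start() + 1, match.group(1))
        for match in LXXLL_PATTERN.finditer(sequence.upper())
    ]


# Hypothetical peptide containing a single LXXLL motif.
print(find_lxxll("AAALKKLLAAA"))
```

Finding the motif in a sequence indicates only a candidate interaction module; whether a given peptide binds the co-activator binding site depends on the structural context described above.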

With certain antagonist ligands, or in the absence of any ligand, the AF2 
helix can be held less tightly in the active conformation, or can be free to adopt 
other conformations. This would either destabilize or disrupt the co-activator 
binding site, thereby reducing or eliminating co-activator binding and transcription 
from certain target genes. Some of the functions of the GR protein depend on 
having the full-length amino acid sequence and certain partner molecules, such as 
co-activators and DNA. However, other functions, including ligand binding and 
ligand-dependent conformational changes, can be observed experimentally using 
isolated domains, chimeras and mutant molecules. 

As described herein, the LBD of a GR can be mutated or engineered, 
expressed, crystallized, its three dimensional structure determined with a ligand 
bound as disclosed in the present invention, and computational methods can be 
used to design ligands to nuclear receptors, preferably to steroid receptors, and 
more preferably to glucocorticoid receptors. 

IV. The Dexamethasone Ligand 

Ligand binding can induce transcriptional activation functions in a variety of 
ways. One way is through the dissociation of the HSP from receptors. This 
dissociation, with consequent dimerization of the receptors and their binding to 
DNA or other proteins in the nuclear chromatin, allows transcriptional regulatory 
properties of the receptors to be manifest. This can be especially true of such 
functions on the amino terminus of the receptors. 

Another way is to alter the receptor to interact with other proteins involved 
in transcription. These could be proteins that interact directly or indirectly with 
elements of the proximal promoter or proteins of the proximal promoter. 
Alternatively, the interactions can be through other transcription factors that 
themselves interact directly or indirectly with proteins of the proximal promoter. 
Several different proteins have been described that bind to the receptors in a 
ligand-dependent manner. In addition, it is possible that in some cases, the 
ligand-induced conformational changes do not affect the binding of other proteins 
to the receptor, but do affect their abilities to regulate transcription. 

In one aspect of the present invention, a GR LBD was co-crystallized with a
fragment of the co-activator TIF2 and the ligand dexamethasone.
Dexamethasone is a synthetic adrenocortical steroid with a molecular weight of
392.47. The IUPAC name for dexamethasone is
(11β,16α)-9-fluoro-11β,17,21-trihydroxy-16α-methylpregna-1,4-diene-3,20-dione.
The empirical formula for dexamethasone is C22H29FO5. Dexamethasone is
represented by the chemical structure:

[Chemical structure of dexamethasone]
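The stated molecular weight can be checked against the empirical formula with a short calculation. The sketch below assumes the standard composition of dexamethasone, C22H29FO5, and rounded standard atomic masses; the function and dictionary names are illustrative only and are not part of the disclosure.

```python
# Rounded standard atomic masses (g/mol); assumed values for illustration.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "F": 18.998, "O": 15.999}

# Assumed composition of dexamethasone: C22H29FO5.
DEXAMETHASONE = {"C": 22, "H": 29, "F": 1, "O": 5}


def molecular_weight(composition):
    """Sum atomic masses weighted by the atom counts in the composition."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


if __name__ == "__main__":
    print(f"{molecular_weight(DEXAMETHASONE):.2f}")
```

Under these assumptions the sum agrees with the molecular weight of 392.47 given in the text to two decimal places.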




Dexamethasone-based therapeutics are commercially available in a variety
of forms and formulations. Dexamethasone, as well as starting materials for the
synthesis of dexamethasone, can also be purchased from various suppliers such
as Sigma (St. Louis, Missouri). The synthesis of dexamethasone, and of
dexamethasone derivatives, is known and described in a variety of sources,
including Arth et al., (1958) J. Am. Chem. Soc. 80: 3161; Oliveto et al., (1958) J.
Am. Chem. Soc. 80: 4431; Fried & Sabo, (1954) J. Am. Chem. Soc. 76: 1455;
Hirschman et al., (1956) J. Am. Chem. Soc. 78: 4957 and U.S. Patent No.
3,007,923 to Muller et al., all of which are incorporated herein in their entirety.

V. The TIF2 Fragment 

The nuclear receptor co-activator TIF2 (SEQ ID NO:17) was co-crystallized
in one aspect of the present invention. Structurally, the nuclear receptor
co-activator TIF2 comprises one domain that interacts with a nuclear receptor
(nuclear receptor interaction domain, abbreviated "NID") and two autonomous
activation domains, AD1 and AD2 (Voegel et al., (1998) EMBO J. 17: 507-519).
The TIF2 NID comprises three NR-interacting modules, with each module
comprising the motif LXXLL (SEQ ID NO:18) (Voegel et al., (1998) EMBO J. 17:
507-519). Mutation of the motif abrogates TIF2's ability to interact with the
ligand-induced activation function-2 (AF-2) found in the ligand-binding domains
(LBDs) of many NRs. Presently, it is thought that TIF2 AD1 activity is mediated
by CREB binding protein (CBP); however, TIF2 AD2 activity does not appear to
involve interaction with CBP (Voegel et al., (1998) EMBO J. 17: 507-519).

In the present invention, residues 732-756 of the TIF2 protein (SEQ ID
NO:17) were co-crystallized with GR and dexamethasone. These residues
comprise the LXXLL motif (SEQ ID NO:18) of AD-2, the third motif in the linear
sequence of TIF2. The TIF2 fragment is 25 residues in length and was
synthesized using an automated peptide synthesis apparatus. SEQ ID NO:17
and other sequences corresponding to TIF2 and other co-activators and
co-repressors can be similarly synthesized using automated apparatuses.
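Locating LXXLL motifs in a co-activator sequence is a simple pattern search, where X stands for any residue. The following is a minimal sketch; the peptide shown is an invented placeholder and is not SEQ ID NO:17, and the function name is hypothetical.

```python
import re

# Invented peptide used only for illustration; it is NOT SEQ ID NO:17.
EXAMPLE_PEPTIDE = "MGSLKELLAAGSPLTQLLNK"

# Lookahead so that overlapping occurrences are also reported.
LXXLL = re.compile(r"(?=(L[A-Z]{2}LL))")


def find_lxxll(sequence):
    """Return (1-based start position, matched residues) for each LXXLL motif."""
    return [(m.start() + 1, m.group(1)) for m in LXXLL.finditer(sequence.upper())]


if __name__ == "__main__":
    for start, motif in find_lxxll(EXAMPLE_PEPTIDE):
        print(start, motif)
```

Positions are reported with 1-based numbering, matching the residue numbering convention used in the text.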

VI. Production of NR, SR and GR Polypeptides

In a preferred embodiment, the present invention provides, for the first time,
for the expression of a soluble GR polypeptide in bacteria, more preferably in E.
coli. The disclosure of the GR polypeptides of the present invention thus makes
available a variety of host-expression vector systems to express an NR, SR or
GR coding sequence. These include, but are not limited to, microorganisms such
as bacteria transformed with recombinant bacteriophage DNA, plasmid DNA or
cosmid DNA expression vectors containing an NR, SR or GR coding sequence;
yeast transformed with recombinant yeast expression vectors containing an NR,
SR or GR coding sequence; insect cell systems infected with recombinant virus
expression vectors (e.g., baculovirus) containing an NR, SR or GR coding
sequence; plant cell systems infected with recombinant virus expression vectors
(e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed
with recombinant plasmid expression vectors (e.g., Ti plasmid) containing an NR,
SR or GR coding sequence; or animal cell systems. The expression elements of
these systems vary in their strength and specificities. Methods for constructing
expression vectors that comprise a partial or the entire native or mutated NR and
GR polypeptide coding sequence and appropriate transcriptional/translational
control signals include in vitro recombinant DNA techniques, synthetic techniques
and in vivo recombination/genetic recombination. See, for example, the
techniques described throughout Sambrook et al., (1989) Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, New York, and Ausubel et al.,
(1989) Current Protocols in Molecular Biology, Greene Publishing Associates and
Wiley Interscience, New York, both incorporated herein in their entirety.

Depending on the host/vector system utilized, any of a number of suitable
transcription and translation elements, including constitutive and inducible
promoters, can be used in the expression vector. For example, when cloning in
bacterial systems, inducible promoters such as pL of bacteriophage λ, plac, ptrp,
ptac (ptrp-lac hybrid promoter) and the like can be used. When cloning in insect
cell systems, promoters such as the baculovirus polyhedrin promoter can be used.
When cloning in plant cell systems, promoters derived from the genome of plant
cells (e.g., heat shock promoters; the promoter for the small subunit of
RUBISCO; the promoter for the chlorophyll a/b binding protein) or from plant
viruses (e.g., the 35S RNA promoter of CaMV; the coat protein promoter of TMV)
can be used. When cloning in mammalian cell systems, promoters derived from
the genome of mammalian cells (e.g., metallothionein promoter) or from
mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 7.5K
promoter) can be used. When generating cell lines that contain multiple copies of
the tyrosine kinase domain DNA, SV40-, BPV- and EBV-based vectors can be
used with an appropriate selectable marker.

Adequate levels of expression of nuclear receptor LBDs can be obtained by
the novel approaches described herein. High level expression in E. coli of ligand
binding domains of TR and other nuclear receptors, including members of the
steroid/thyroid receptor superfamily, such as the estrogen (ER), androgen (AR),
mineralocorticoid (MR), progesterone (PR), RAR, RXR and vitamin D (VDR)
receptors, can also be achieved after review of the expression of a soluble GR
polypeptide in bacteria, more preferably E. coli, disclosed herein. The disclosure
of the GR polypeptides of the present invention thus makes available a
variety of host-expression vector systems. Yeast and other eukaryotic expression
systems can be used with nuclear receptors that bind heat shock proteins, since
these nuclear receptors are generally more difficult to express in bacteria, with the
exception of ER, which can be expressed in bacteria. In a preferred embodiment
of the present invention, as disclosed in the Examples, a GR LBD is expressed in
E. coli.

Representative nuclear receptors or their ligand binding domains have
been cloned and sequenced, including human RARα, human RARγ, human
RXRα, human RXRβ, human PPARα, human PPARβ or δ (delta), human PPARγ,
human VDR, human ER (as described in Seielstad et al., (1995) Mol. Endocrinol.
9: 647-658), human GR, human PR, human MR, and human AR. The ligand
binding domain of each of these nuclear receptors has been identified. Using this
information in conjunction with the methods described herein, one of ordinary skill
in the art can express and purify the LBD of any of the nuclear receptors, bind it to
an appropriate ligand, and crystallize the nuclear receptor's LBD with a bound
ligand, if desired.

Extracts of expressing cells are a suitable source of receptor for purification
and preparation of crystals of the chosen receptor. To obtain such expression, a
vector can be constructed in a manner similar to that employed for expression of
the rat TR alpha (Apriletti et al., (1995) Protein Expression and Purification 6:
368-370). The nucleotides encoding the amino acids encompassing the ligand
binding domain of the receptor to be expressed can be inserted into an expression
vector such as the one employed by Apriletti et al. (1995). Stretches of adjacent
amino acid sequences can be included if more structural information is desired.

The native and mutated nuclear receptors in general, and more particularly
SR and GR polypeptides, and fragments thereof, of the present invention can also
be chemically synthesized in whole or part using techniques that are known in the
art (See, e.g., Creighton, (1983) Proteins: Structures and Molecular Principles,
W.H. Freeman & Co., New York, incorporated herein in its entirety).

In a preferred embodiment, the present invention provides, for the first time,
for the expression of a soluble GR polypeptide in bacteria, more preferably E.
coli, and subsequent purification thereof. Representative purification techniques
are also disclosed in the Examples, particularly Example 1. The GR polypeptides
of the present invention, disclosed herein, can thus now provide the ability to
employ additional purification techniques for both liganded and unliganded NRs.
Thus, it is envisioned, based upon the disclosure of the present invention, that
purification of the unliganded or liganded NR, SR or GR receptor can be obtained
by conventional techniques, such as hydrophobic interaction chromatography
(HPLC), ion exchange chromatography (HPLC), and heparin affinity
chromatography. To achieve higher purification for improved crystals of nuclear
receptors, it is sometimes preferable to ligand shift purify the nuclear receptor
using a column that separates the receptor according to charge, such as an ion
exchange or hydrophobic interaction column, and then bind the eluted receptor
with a ligand. The ligand induces a change in the receptor's surface charge such
that, when re-chromatographed on the same column, the receptor elutes at
the position of the liganded receptor and is thereby separated from material that
co-eluted with the unliganded receptor in the original column run. Typically,
saturating concentrations of ligand can be used in the column and the protein can
be preincubated with the ligand prior to passing it over the column.

More recently developed methods involve engineering a "tag", such as a
histidine tag, placed on the end of the protein, such as on the amino terminus, and
then using a nickel chelation column for purification. See Janknecht, (1991) Proc.
Natl. Acad. Sci. U.S.A. 88: 8972-8976, incorporated by reference.

VII. Formation of NR, SR and GR Ligand Binding Domain Crystals 

In one embodiment, the present invention provides crystals of GRα LBD.
The crystals were obtained using the methodology disclosed in the Laboratory
Examples. The GRα LBD crystals, which can be native crystals, derivative
crystals or co-crystals, have hexagonal unit cells (a hexagonal unit cell is a unit
cell wherein a = b ≠ c, and wherein α = β = 90°, and γ = 120°) and space group
symmetry P6₁. There are two GRα LBD molecules in the asymmetric unit. In this
GRα crystalline form, the unit cell has dimensions of a = b = 126.014 Å, c = 86.312
Å, and α = β = 90°, and γ = 120°. This crystal form can be formed in a
crystallization reservoir as described in the Examples.
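As a worked illustration of the unit cell parameters given above, the cell volume follows from the general triclinic volume formula, which for a hexagonal cell reduces to V = a²c·sin(120°). The sketch below is illustrative only; the function name is an assumption and the calculation is not taken from the Examples.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """General (triclinic) unit cell volume; lengths in angstroms, angles in degrees."""
    ca = math.cos(math.radians(alpha_deg))
    cb = math.cos(math.radians(beta_deg))
    cg = math.cos(math.radians(gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)


if __name__ == "__main__":
    # Cell dimensions as stated in the text for the GR LBD crystal form.
    volume = unit_cell_volume(126.014, 126.014, 86.312, 90.0, 90.0, 120.0)
    print(f"{volume:.0f} cubic angstroms")
```

With the stated dimensions, this gives a volume of roughly 1.19 × 10^6 cubic angstroms.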

VII.A. Preparation of NR, SR and GR Crystals

The native and derivative co-crystals, and fragments thereof, disclosed in
the present invention can be obtained by a variety of techniques, including batch,
liquid bridge, dialysis, vapor diffusion and hanging drop methods (See, e.g.,
McPherson, (1982) Preparation and Analysis of Protein Crystals, John Wiley, New
York; McPherson, (1990) Eur. J. Biochem. 189:1-23; Weber, (1991) Adv. Protein
Chem. 41:1-36). In a preferred embodiment, the vapor diffusion and hanging drop
methods are used for the crystallization of NR, SR and GR polypeptides and
fragments thereof. A more preferred hanging drop method technique is disclosed
in the Examples.

In general, native crystals of the present invention are grown by dissolving
substantially pure NR, SR or GR polypeptide or a fragment thereof in an aqueous
buffer containing a precipitant at a concentration just below that necessary to
precipitate the protein. Water is removed by controlled evaporation to produce
precipitating conditions, which are maintained until crystal growth ceases.

In one embodiment of the invention, native crystals are grown by vapor
diffusion (See, e.g., McPherson, (1982) Preparation and Analysis of Protein
Crystals, John Wiley, New York; McPherson, (1990) Eur. J. Biochem. 189:1-23).
In this method, the polypeptide/precipitant solution is allowed to equilibrate in a
closed container with a larger aqueous reservoir having a precipitant
concentration optimal for producing crystals. Generally, less than about 25 µL of
NR, SR or GR polypeptide solution is mixed with an equal volume of reservoir
solution, giving a precipitant concentration about half that required for
crystallization. This solution is suspended as a droplet underneath a coverslip,
which is sealed onto the top of the reservoir. The sealed container is allowed to
stand until crystals grow. Crystals generally form within two to six weeks, and are
suitable for data collection within approximately seven to ten weeks. Of course,
those of skill in the art will recognize that the above-described crystallization
procedures and conditions can be varied.
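The statement that mixing equal volumes gives a precipitant concentration about half that of the reservoir is simple dilution arithmetic. The sketch below illustrates it under the assumption that the polypeptide stock contains no precipitant and the reservoir contains no polypeptide; the concentrations and volumes in the usage lines are invented for illustration and are not taken from the Examples.

```python
def mix_drop(protein_conc, precipitant_conc, protein_volume, reservoir_volume):
    """Return (protein, precipitant) concentrations in the freshly mixed drop.

    Assumes the protein stock has no precipitant and the reservoir has no
    protein, so each component is diluted by the volume of the other solution.
    """
    total = protein_volume + reservoir_volume
    if total <= 0:
        raise ValueError("total drop volume must be positive")
    return (
        protein_conc * protein_volume / total,
        precipitant_conc * reservoir_volume / total,
    )


if __name__ == "__main__":
    # Hypothetical values: 10 mg/mL protein, 2.0 M precipitant, 2 uL + 2 uL drop.
    protein, precipitant = mix_drop(10.0, 2.0, 2.0, 2.0)
    print(protein, precipitant)
```

During equilibration against the reservoir, water leaves the drop and both concentrations rise toward the conditions under which crystals can form.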

VII.B. Preparation of Derivative Crystals

Derivative crystals of the present invention, e.g. heavy atom derivative 
crystals, can be obtained by soaking native crystals in mother liquor containing 
salts of heavy metal atoms. Such derivative crystals are useful for phase analysis 
in the solution of crystals of the present invention. In a preferred embodiment of 
the present invention, for example, soaking a native crystal in a solution 
containing methyl-mercury chloride provides derivative crystals suitable for use as 
isomorphous replacements in determining the X-ray crystal structure of a NR, SR 
or GR polypeptide. Additional reagents useful for the preparation of the derivative 
crystals of the present invention will be apparent to those of skill in the art after 
review of the disclosure of the present invention presented herein.

VII.C. Preparation of Co-crystals

Co-crystals of the present invention can be obtained by soaking a native
crystal in mother liquor containing compounds known or predicted to bind the LBD
of an NR, SR or GR, or a fragment thereof. Alternatively, co-crystals can be
obtained by co-crystallizing an NR, SR or GR LBD polypeptide or a fragment
thereof in the presence of one or more compounds known or predicted to bind the
polypeptide. In a preferred embodiment, as disclosed in the Examples, such a
compound is dexamethasone.

VII.D. Solving a Crystal Structure of the Present Invention

Crystal structures of the present invention can be solved using a variety of
techniques including, but not limited to, isomorphous replacement, anomalous
scattering or molecular replacement methods. Computer software packages are
also helpful in solving a crystal structure of the present invention. Applicable
software packages include, but are not limited to, the CCP4 package disclosed in
the Examples, the X-PLOR™ program (Brunger, (1992) X-PLOR, Version 3.1. A
System for X-ray Crystallography and NMR, Yale University Press, New Haven,
Connecticut; X-PLOR is available from Molecular Simulations, Inc., San Diego,
California), XtalView (McRee, (1992) J. Mol. Graphics 10: 44-46; XtalView is
available from the San Diego Supercomputer Center), SHELXS 97 (Sheldrick,
(1990) Acta Cryst. A46: 467; SHELX 97 is available from the Institute of Inorganic
Chemistry, Georg-August-Universität, Göttingen, Germany), HEAVY (Terwilliger,
Los Alamos National Laboratory) and SHAKE-AND-BAKE (Hauptman, (1997)
Curr. Opin. Struct. Biol. 7: 672-80; Weeks et al., (1993) Acta Cryst. D49: 179;
available from the Hauptman-Woodward Medical Research Institute, Buffalo, New
York). See also Ducruix & Giegé, (1992) Crystallization of Nucleic
Acids and Proteins: A Practical Approach, IRL Press, Oxford, England, and
references cited therein.

VIII. Characterization and Solution of a GRα Ligand Binding Domain Crystal

Referring now to Figure 3A, the overall arrangement of the GR LBD dimer
is depicted in a ribbon/worm diagram that was derived from the crystalline
polypeptide of the present invention. The two GR LBDs are shown in white and
gray worm representation. The TIF2 peptides TIF2 are shown in gray ribbon and
the two dexamethasone ligands DEX are shown in space filling. The N terminus and
C terminus of each GR LBD are labeled with an N and a C, respectively. There is an
interface between the two LBDs at beta turns and beta strands.

Referring now to Figures 3B and 3C, two orientations of the GR/TIF2/DEX 
complex are depicted. In each figure, the TIF2 peptide TIF2 is shown in ribbon 
and the GR LBD is shown in worm. The AF2 helix AF2 of GR is shown in gray 
worm in each figure. The key structural elements helix 9 H9 and helix 3 H3 are 
indicated, as is the N terminus N. The DEX compound DEX is shown in dark gray 
shading. In Figures 3B and 3C, the interaction of helix 3 H3 and the AF2 helix 
AF2 with dexamethasone DEX can be seen. 

Referring now to Figures 4A and 4B, the overlap of GR LBD with the LBDs 
of the AR and PR (Figures 4A and 4B, respectively) is depicted. The AR and PR 
are shown as a thin line, while the GR is shown as a thick line. Backbone Cα
atoms are also shown. This superposition is consistent with the sequence 
alignment approach taken in the design of the GR LBD polypeptide disclosed 
herein. 

RMS deviation calculation results were as follows:

        GR      PR      AR
GR      0.00    0.94    1.56
PR      0.94    0.00    1.34
AR      1.56    1.34    0.00



where, in each of the three calculations, the RMS deviation was computed
using 980 backbone atoms (N, C alpha, C and O) from 245 aligned residues. These
245 residues are GR:531-775, PR:686-897,899-931 and AR:672-883,885-917.
Several GR and PR residues before helix-1 were omitted in the calculations, as
was one residue at the C-terminus, to correspond to the shorter AR construct.
One residue (PR:898 and AR:884) was also omitted in the 10-AF2 loop because
of the deletion in GR. The RMS deviations suggest that the AR structure has
diverged away from GR and PR, and graphical examination confirmed this at least
qualitatively.
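The RMS deviation calculation described above can be sketched as follows. This is a minimal illustration, not the program actually used: it assumes the two coordinate sets have already been optimally superimposed and are listed in matching atom order, and all function names are hypothetical. The helper for counting residues simply confirms that the stated ranges give 245 residues, and hence 980 atoms at four backbone atoms per residue.

```python
import math


def count_residues(ranges):
    """Count residues in a list of inclusive (first, last) residue ranges."""
    return sum(last - first + 1 for first, last in ranges)


def rmsd(coords_a, coords_b):
    """RMS deviation between two equally long lists of (x, y, z) coordinates.

    Assumes the structures are already superimposed and atoms are paired
    by list position.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    gr = count_residues([(531, 775)])
    pr = count_residues([(686, 897), (899, 931)])
    ar = count_residues([(672, 883), (885, 917)])
    print(gr, pr, ar, gr * 4)
```

In practice the superposition itself is usually obtained first by a least-squares fit, which this sketch does not perform.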

Referring now to Figure 5, a sequence alignment of steroid receptors,
particularly an alignment of the F602S GRα sequence with MR, PR, AR, ERα, and
ERβ, is depicted. Residues that lie within 5.0 angstroms of the ligand are identified
with small square boxes around the one-letter amino acid code. The ligands used
for this calculation are dexamethasone (for GR), progesterone (for PR),
dihydrotestosterone (for AR), estradiol (for ERα) and genistein (for ERβ). The
alpha-helices and beta-strands observed in the X-ray structures are identified by
the larger boxes and captions. Note that the secondary structure of MR is not
publicly known at this time, and is thus not annotated in the Figure. More than
one structure is available for PR, AR, ERα and ERβ, and, in some cases, the
alpha-helices have different endpoints in these different X-ray structures. The
variation in the alpha-helices is indicated here by using boxes with thicker and
thinner linewidths, where the thicker linewidth box encompasses residues that
adopt the same secondary structure in all available X-ray structures, and thinner
linewidth boxes encompass residues that adopt an alpha-helical structure in some
but not all X-ray structures. The secondary structures were determined by
graphical examination of the X-ray structures.
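The selection of residues lying within 5.0 angstroms of the ligand can be illustrated with a simple distance test. The sketch below assumes that a residue qualifies when any one of its atoms lies within the cutoff of any ligand atom; the text does not state the exact criterion used, and the residue labels and coordinates in the usage lines are invented.

```python
import math


def residues_near_ligand(protein_atoms, ligand_atoms, cutoff=5.0):
    """Return sorted residue labels having any atom within `cutoff` of the ligand.

    protein_atoms: iterable of (residue_label, (x, y, z))
    ligand_atoms:  iterable of (x, y, z)
    """
    ligand_atoms = list(ligand_atoms)
    near = set()
    for residue, xyz in protein_atoms:
        if residue in near:
            continue  # residue already selected by an earlier atom
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            near.add(residue)
    return sorted(near)


if __name__ == "__main__":
    # Invented coordinates and labels, for illustration only.
    protein = [
        ("RES1", (0.0, 0.0, 0.0)),
        ("RES1", (9.0, 0.0, 0.0)),
        ("RES2", (12.0, 0.0, 0.0)),
    ]
    ligand = [(4.0, 0.0, 0.0)]
    print(residues_near_ligand(protein, ligand))
```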

It is also noted that, within the ligand binding domains (LBDs), the
sequence identity is as follows:

Table 1
Sequence Identity of NR LBDs

        GR      MR      PR      AR
GR      100%    56%     54%     50%
MR      56%     100%    55%     51%
PR      54%     55%     100%    55%
AR      50%     51%     55%     100%
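Pairwise sequence identities of the kind tabulated above are computed from aligned sequences. The sketch below shows one common convention, in which identical positions are divided by the number of alignment columns where neither sequence has a gap; the text does not state which convention was used for Table 1, so this choice is an assumption, as are the function name and the toy sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned sequences, not receptor sequences.
    print(percent_identity("ACDEFG-H", "ACDQFGKH"))
```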



VIII.A. Unique Structural Differences Between GRα and Other SRs

Even though the GR LBD shares over 50% sequence identity with PR and
AR and folds into a similar three-layer helical sandwich (Figures 4A and 4B), there
are a number of unique structural differences in their structures. The most distinct
differences are noted in the extended strand between helices 1 and 3, and the
position of helix 7. These differences contribute to a unique shape of the binding
pocket for each receptor (Figures 6A and 6B) and may thus provide a molecular
basis for steroid specificity of these receptors. The detailed structural information
about the GR LBD and the pocket provided herein can be further exploited to
design receptor-specific agonists or antagonists.

VIII.B. Dexamethasone

The ligand binding domain of GRα was co-crystallized with
dexamethasone, which has the IUPAC name
(11β,16α)-9-fluoro-11β,17,21-trihydroxy-16α-methylpregna-1,4-diene-3,20-dione
and is shown below.

[Chemical structure of dexamethasone]




Dexamethasone is an agonist of GRα and is useful for treatment of
GRα-mediated diseases or conditions including inflammation, tissue rejection,
auto-immunity, malignancies such as leukemias and lymphomas, Cushing's
syndrome, acute adrenal insufficiency, congenital adrenal hyperplasia, rheumatic
fever, polyarteritis nodosa, granulomatous polyarteritis, inhibition of myeloid cell
lines, immune proliferation/apoptosis, HPA axis suppression and regulation,
hypercortisolemia, modulation of the Th1/Th2 cytokine balance, chronic kidney
disease, stroke and spinal cord injury, hypercalcemia, hyperglycemia, chronic
primary adrenal insufficiency, secondary adrenal insufficiency, cerebral edema,
thrombocytopenia, and Little's syndrome, as well as many other conditions.

VIII.C. Characterization of the GRα Binding Pocket and Interactions
Between GRα and Dexamethasone

Referring now to Figure 6A, the GR ligand binding pocket is depicted
schematically. The GR LBD is shown in a worm representation
and the pocket is shown with a white surface. The gross shape of the binding
pocket is depicted here with a smooth surface that covers the available volume
within the binding pocket. The available volume is mapped by placing the protein
within a grid, and then checking, for each grid point, whether a spherical probe
atom can fit at that point without bumping into the protein. The spacing of grid
points was taken as 0.50 Å, and the radius of the probe atom was taken as 1.40 Å.
Atoms in the protein were represented as spheres with a radius of 1.20 Å for
hydrogen, 1.70 Å for carbon, 1.55 Å for nitrogen, 1.52 Å for oxygen and 1.80 Å for
sulfur. These are essentially the atomic radius values suggested by Bondi (A.
Bondi, "van der Waals Volumes and Radii," Journal of Physical Chemistry, 68,
441-451 (1964)). The protein was represented with all hydrogen atoms in order to
handle its volume more accurately. These hydrogen atoms were added to obtain
the protonation states expected at pH 7 using the MVP program. The MVP
program adds hydrogens using standard geometry, and then refines the initial
coordinates with energy minimization, holding all heavy atoms fixed. The
"available" grid points are defined as those for which the probe sphere does not
bump into any sphere corresponding to a protein atom. The smooth surface was
then constructed over these available binding site grid points using the dot surface
program of Connolly (Michael L. Connolly, "Solvent-Accessible Surfaces of
Proteins and Nucleic Acids," Science 221, 709-713 (1983)) with a probe radius of
1.30 Å. The protein chain is shown with a backbone ribbon depiction.
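The grid-based mapping of available volume described above can be sketched in a few lines. This is an illustrative reimplementation under stated assumptions, not the program actually used: the grid box is supplied by the caller, a grid point is treated as available when its distance to every atom center is at least the sum of that atom's radius and the probe radius, and no attempt is made to restrict points to the enclosed pocket or to build the Connolly surface. The function and variable names are hypothetical; the radii, probe radius and grid spacing are those stated in the text.

```python
import math

# Radii in angstroms, as stated in the text (Bondi values).
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}
PROBE_RADIUS = 1.40
GRID_SPACING = 0.50


def available_grid_points(atoms, box_min, box_max,
                          spacing=GRID_SPACING, probe=PROBE_RADIUS):
    """Return grid points where a probe sphere does not bump into any atom.

    atoms: list of (element, x, y, z); box_min, box_max: (x, y, z) corners.
    """
    # Number of grid points along each axis, including both box edges.
    counts = [int(math.floor((hi - lo) / spacing + 1e-9)) + 1
              for lo, hi in zip(box_min, box_max)]
    points = []
    for i in range(counts[0]):
        for j in range(counts[1]):
            for k in range(counts[2]):
                p = (box_min[0] + i * spacing,
                     box_min[1] + j * spacing,
                     box_min[2] + k * spacing)
                if all(math.dist(p, (x, y, z)) >= VDW_RADII[el] + probe
                       for el, x, y, z in atoms):
                    points.append(p)
    return points


if __name__ == "__main__":
    # Toy system: one carbon atom at the origin, grid along the x axis only.
    pts = available_grid_points([("C", 0.0, 0.0, 0.0)],
                                (-4.0, 0.0, 0.0), (4.0, 0.0, 0.0))
    print(len(pts), pts)
```

For a real structure, the cost of this brute-force loop grows with the product of the number of grid points and the number of atoms, so a neighbor list or cell grid would normally be used to limit the atoms checked at each point.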

Referring now to Figure 6B, electron density in the GR-dexamethasone
interface is depicted. The electron density is calculated with Fo coefficients and
shown at a one sigma cutoff. The ligand DEX is in the center of the figure. Key
residues L732, A605, R611, Q570, G567, N564, and F749 encircle ligand DEX.
Ligand DEX displays a good spatial fit, with no overlaps and no apparent charge
repulsions.

Referring now to Figure 7, molecular interactions between the GR protein
and the dexamethasone are depicted. There are 22 residues from GR involved in
direct interactions with the dexamethasone, and the residues are Q570, L566,
G567, L563, W600, L753, N564, F749, C736, I747, M560, T739, Q642, Y735,
L732, M646, M601, A605, F623, M604, L608, and R611.

VIII.D. Structural Mechanism of Improving Protein Solubility by the F602S
Mutation

Figure 8 is a wire frame diagram that provides a closer look at the F602S
mutation. F602 is lipophilic but resides in a hydrophilic environment, a
situation that could destabilize the protein. The mutation of the phenylalanine (F)
to serine (S) allows the S602 side chain to make direct hydrogen bonds with
two water molecules, shown as 1H2O and 2H2O in Figure 8. Association
distances of 2.416 and 4.036 are indicated between S602 and 1H2O and 2H2O,
respectively. Other residues are also shown in interaction with 1H2O and 2H2O,
and these include H726 (which is also coordinated with water molecule 1H2O),
Y764 (which is also coordinated with water molecules 1H2O and 2H2O), Y598 and
W600. An association distance of 4.354 is shown between 1H2O and H726, and
an association distance of 3.286 is shown with Y764. An association distance of
3.157 is shown between 1H2O and 2H2O. It is envisioned that this complex
hydrogen bond network, initiated by the F602S mutation and the two water
molecules, improves the protein stability and thus the solubility as well.

VIII.E. Generation of Easily-Solved NR, SR and GR Crystals

The present invention discloses a substantially pure GR LBD polypeptide in
crystalline form. In a preferred embodiment, exemplified in the Figures and
Laboratory Examples, GRα is crystallized with bound ligand. Crystals can be
formed from NR, SR and GR LBD polypeptides that are usually expressed by a
cell culture, such as E. coli. Bromo- and iodo-substitutions can be included during
the preparation of crystal forms and can act as heavy atom substitutions in GR
ligands and crystals of NRs, SRs and GRs. This method can be advantageous for
the phasing of the crystal, which is a crucial, and sometimes limiting, step in
solving the three-dimensional structure of a crystallized entity. Thus, the need for
generating the heavy metal derivatives traditionally employed in crystallography
can be eliminated. After the three-dimensional structure of an NR, SR or GR, or an
NR, SR or GR LBD, with or without a ligand bound, is determined, the resultant
three-dimensional structure can be used in computational methods to design
synthetic ligands for the NR, SR or GR and for other NR, SR or GR polypeptides.
Further structure-activity relationships can be determined through routine testing,
using assays disclosed herein and known in the art.

IX. Uses of NR, SR and GR Crystals and the Three-Dimensional Structure of
the Ligand Binding Domain of GRα

The solved crystal structure of the present invention is useful in the design
of modulators of activity mediated by the glucocorticoid receptor and by other
nuclear receptors. Evaluation of the available sequence data shows that GRα is
particularly similar to MR, PR and AR. The GRα LBD has approximately 55%,
54% and 50% sequence identity to the MR, PR and AR LBDs, respectively. The
GRβ amino acid sequence is identical to the GRα amino acid sequence for
residues 1-726, but the remaining 16 residues in GRβ show no significant
similarity to the remaining 51 residues in GRα.

The present GRα X-ray structure can also be used to build models for
targets where no X-ray structure is available, such as with GRβ and MR. Indeed,
a model for GRα using the available X-ray structures of PR and/or AR as
templates was built and used by the present co-inventors to obtain a starting
model for the molecular replacement calculation used in solving the X-ray
structure of GRα disclosed herein. These models will be less accurate than X-ray
structures, but can help in the design of compounds targeted for GRβ and MR, for
example. Also, these models can aid the design of compounds to selectively
modulate any desired subset of GRα, GRβ, MR, PR, AR and other related nuclear
receptors.

IX.A. Design and Development of NR, SR and GR Modulators

The present invention, particularly the computational methods, can be used
to design drugs for a variety of nuclear receptors, such as receptors for
glucocorticoids (GRs), androgens (ARs), mineralocorticoids (MRs), progestins
(PRs), estrogens (ERs), thyroid hormones (TRs), vitamin D (VDRs), retinoids
(RARs and RXRs) and peroxisomal proliferators (PPARs). The present invention
can also be applied to the "orphan receptors," as they are structurally homologous
in terms of modular domains and primary structure to classic nuclear receptors,
such as steroid and thyroid receptors. The amino acid homologies of orphan
receptors with other nuclear receptors range from very low (<15%) to in the
range of 35% when compared to rat RARα and human TRβ receptors, for
example.

The knowledge of the structure of the GRα ligand binding domain (LBD),
an aspect of the present invention, provides a tool for investigating the mechanism
of action of GRα and other NR, SR and GR polypeptides in a subject. For
example, various computer modeling programs, as described herein, can predict
the binding of various ligand molecules to the LBD of GRβ, or another steroid
receptor or, more generally, nuclear receptor. Upon discovering that such binding
in fact takes place, knowledge of the protein structure then allows design and
synthesis of small molecules that mimic the functional binding of the ligand to the
LBD of GRα, and to the LBDs of other polypeptides. This is the method of
"rational" drug design, further described herein.

Use of the isolated and purified GRα crystalline structure of the present
invention in rational drug design is thus provided in accordance with the present
invention. Additional rational drug design techniques are described in U.S. Patent
Nos. 5,834,228 and 5,872,011, incorporated herein in their entirety.

Thus, in addition to the compounds described herein, other sterically similar
compounds can be formulated to interact with the key structural regions of an NR,
SR or GR in general, or of GRα in particular. The generation of a structural
functional equivalent can be achieved by the techniques of modeling and
chemical design known to those of skill in the art and described herein. It will be
understood that all such sterically similar constructs fall within the scope of the
present invention.

IX.A.1. Rational Drug Design

The three-dimensional structure of ligand-binding GRα is unprecedented
and will greatly aid in the development of new synthetic ligands for NR, SR and
GR polypeptides, such as GR agonists and antagonists, including those that bind
exclusively to any one of the GR subtypes. In addition, NRs, SRs and GRs are
well suited to modern methods, including three-dimensional structure elucidation
and combinatorial chemistry, such as those disclosed in U.S. Patent Nos.
5,463,564 and 6,236,946, incorporated herein by reference. Structure
determination using X-ray crystallography is possible because of the solubility
properties of NRs, SRs and GRs. Computer programs that use crystallography
data when practicing the present invention will enable the rational design of
ligands to these receptors.

Programs such as RASMOL (Biomolecular Structures Group, Glaxo
Wellcome Research & Development, Stevenage, Hertfordshire, UK; Version 2.6,
August 1995; Version 2.6.4, December 1998; Copyright © Roger Sayle 1992-1999)
and Protein Explorer (Version 1.87, July 3, 2001, © Eric Martz, 2001 and
available online at http://www.umass.edu/microbio/chime/explorer/index.htm) can
be used with the atomic structural coordinates from crystals generated by
practicing the invention or used to practice the invention by generating
three-dimensional models and/or determining the structures involved in ligand binding.
Computer programs such as those sold under the registered trademark INSIGHT
II® and the programs GRASP (Nicholls et al., (1991) Proteins 11: 281) and
SYBYL™ (available from Tripos, Inc. of St. Louis, Missouri) allow for further
manipulations and the ability to introduce new structures. In addition, high
throughput binding and bioactivity assays can be devised using purified
recombinant protein and modern reporter gene transcription assays known to
those of skill in the art in order to refine the activity of a designed ligand.

A method of identifying modulators of the activity of an NR, SR or GR
polypeptide using rational drug design is thus provided in accordance with the
present invention. The method comprises designing a potential modulator for an
NR, SR or GR polypeptide of the present invention that will form non-covalent
interactions with amino acids in the ligand binding pocket based upon the
crystalline structure of the GRa LBD polypeptide; synthesizing the modulator; and
determining whether the potential modulator modulates the activity of the NR, SR
or GR polypeptide. In a preferred embodiment, the modulator is designed for an
SR polypeptide. In a more preferred embodiment, the modulator is designed for a
GRa polypeptide. Preferably, the GRa polypeptide comprises the amino acid
sequence of any of SEQ ID NOs:2, 4, 6 and 8, and more preferably, the GRa LBD
comprises the amino acid sequence of any of SEQ ID NOs:10, 12, 14, 16 and 31.
The determination of whether the modulator modulates the biological activity of an
NR, SR or GR polypeptide is made in accordance with the screening methods
disclosed herein, or by other screening methods known to those of skill in the art.
Modulators can be synthesized using techniques known to those of ordinary skill
in the art.

In an alternative embodiment, a method of designing a modulator of an NR, 
SR or GR polypeptide in accordance with the present invention is disclosed 
comprising: (a) selecting a candidate NR, SR or GR ligand; (b) determining which 
amino acid or amino acids of an NR, SR or GR polypeptide interact with the ligand
using a three-dimensional model of a crystallized GRa LBD; (c) identifying in a 
biological assay for NR, SR or GR activity a degree to which the ligand modulates 
the activity of the NR, SR or GR polypeptide; (d) selecting a chemical modification 
of the ligand wherein the interaction between the amino acids of the NR, SR or 
GR polypeptide and the ligand is predicted to be modulated by the chemical 
modification; (e) synthesizing a chemical compound with the selected chemical 
modification to form a modified ligand; (f) contacting the modified ligand with the
NR, SR or GR polypeptide; (g) identifying in a biological assay for NR, SR or GR 
activity a degree to which the modified ligand modulates the biological activity of 
the NR, SR or GR polypeptide; and (h) comparing the biological activity of the NR, 
SR or GR polypeptide in the presence of modified ligand with the biological 
activity of the NR, SR or GR polypeptide in the presence of the unmodified ligand, 
whereby a modulator of an NR, SR or GR polypeptide is designed. 

An additional method of designing modulators of an NR, SR or GR or an 
NR, SR or GR LBD can comprise: (a) determining which amino acid or amino 
acids of an NR, SR or GR LBD interact with a first chemical moiety (at least one)
of the ligand using a three dimensional model of a crystallized protein comprising 
an NR, SR or GR LBD in complex with a bound ligand and a co-activator; and (b) 
selecting one or more chemical modifications of the first chemical moiety to 
produce a second chemical moiety with a structure to either decrease or increase 
an interaction between the interacting amino acid and the second chemical moiety 
compared to the interaction between the interacting amino acid and the first 
chemical moiety. This is a general strategy only, however, and variations on this
disclosed protocol would be apparent to those of skill in the art upon consideration
of the present disclosure.

Once a candidate modulator is synthesized as described herein and as will
be known to those of skill in the art upon contemplation of the present invention, it
can be tested using assays to establish its activity as an agonist, partial agonist or
antagonist, and its affinity, as described herein. After such testing, a candidate
modulator can be further refined by generating LBD crystals with the candidate
modulator bound to the LBD. The structure of the candidate modulator can then
be further refined using the chemical modification methods described herein for
three dimensional models to improve the activity or affinity of the candidate
modulator and make second generation modulators with improved properties,
such as that of a super agonist or antagonist, as described herein.

IX.A.2. Methods for Using the GRa LBD Structural Coordinates For
Molecular Design

For the first time, the present invention permits the use of molecular design 
techniques to design, select and synthesize chemical entities and compounds, 
including modulatory compounds, capable of binding to the ligand binding pocket 
or an accessory binding site of an NR, SR or GR and an NR, SR or GR LBD, in 
whole or in part. Correspondingly, the present invention also provides for the
application of similar techniques in the design of modulators of any NR, SR or GR 
polypeptide. 

In accordance with a preferred embodiment of the present invention, the
structure coordinates of a crystalline GRa LBD can be used to design compounds
that bind to a GR LBD (more preferably a GRa LBD) and alter the properties of a
GR LBD (for example, the dimerization ability, ligand binding ability or effect on
transcription) in different ways. One aspect of the present invention provides for
the design of compounds that can compete with natural or engineered ligands of a
GR polypeptide by binding to all, or a portion of, the binding sites on a GR LBD.
The present invention also provides for the design of compounds that can bind to
all, or a portion of, an accessory binding site on a GR that is already binding a
ligand. Similarly, non-competitive agonists/ligands that bind to and modulate GR
LBD activity, whether or not it is bound to another chemical entity, and partial
agonists and antagonists can be designed using the GR LBD structure
coordinates of this invention.

A second design approach is to probe an NR, SR or GR or an NR, SR or
GR LBD (preferably a GRa or GRa LBD) crystal with molecules comprising a
variety of different chemical entities to determine optimal sites for interaction
between candidate NR, SR or GR or NR, SR or GR LBD modulators and the
polypeptide. For example, high resolution X-ray diffraction data collected from
crystals saturated with solvent allow the determination of the site where each
type of solvent molecule adheres. Small molecules that bind tightly to those sites
can then be designed, synthesized and tested for their NR, SR or GR
modulator activity. Representative designs are also disclosed in published PCT
application WO 99/26966.

Once a computationally-designed ligand is synthesized using the methods 
of the present invention or other methods known to those of skill in the art, assays 
can be used to establish the efficacy of the ligand as a modulator of NR, SR or GR
(preferably GRa) activity. After such assays, the ligands can be further refined by 
generating intact NR, SR or GR, or NR, SR or GR LBD, crystals with a ligand 
bound to the LBD. The structure of the ligand can then be further refined using 
the chemical modification methods described herein and known to those of skill in 
the art, in order to improve the modulation activity or the binding affinity of the 
ligand. This process can lead to second generation ligands with improved 
properties. 

Ligands also can be selected that modulate NR, SR or GR responsive 
gene transcription by the method of altering the interaction of co-activators and 
co-repressors with their cognate NR, SR or GR. For example, agonistic ligands 
can be selected that block or dissociate a co-repressor from interacting with a GR, 
and/or that promote binding or association of a co-activator. Antagonistic ligands 
can be selected that block co-activator interaction and/or promote co-repressor 
interaction with a target receptor. Selection can be done via binding assays that 
screen for designed ligands having the desired modulatory properties. Preferably, 
interactions of a GRa polypeptide are targeted. A suitable screening assay
that can be employed, mutatis mutandis, in the present invention is described in
Oberfield, J.L., et al., Proc Natl Acad Sci USA. (1999) May 25; 96(11):6102-6,
incorporated herein in its entirety by reference. Other examples of suitable
screening assays for GR function include an in vitro peptide binding assay
representing ligand-induced interaction with coactivator (Zhou et al., (1998) Mol.
Endocrinol. 12: 1594-1604; Parks et al., (1999) Science 284: 1365-1368), a
cell-based reporter assay related to transcription from a GRE (reviewed in Jenkins et
al., (2001) Trends Endocrinol. Metab. 12: 122-126), or a cell-based reporter assay
related to repression of genes driven via NF-kB (DeBosscher et al., (2000) Proc
Natl Acad Sci USA. 97: 3919-3924).

IX.A.3. Methods of Designing NR, SR or GR LBD Modulator
Compounds

Knowledge of the three-dimensional structure of the GR LBD complex of 
the present invention can facilitate a general model for modulator (e.g. agonist, 
partial agonist, antagonist and partial antagonist) design. Other ligand-receptor
complexes belonging to the nuclear receptor superfamily can have a ligand
binding pocket similar to that of GR and therefore the present invention can be
employed in agonist/antagonist design for other members of the nuclear receptor
superfamily and the steroid receptor subfamily. Examples of suitable receptors
include those of the NR superfamily and those of the SR subfamily.

The design of candidate substances, also referred to as "compounds" or
"candidate compounds", that bind to or inhibit NR, SR or GR LBD-mediated
activity according to the present invention generally involves consideration of two
factors. First, the compound must be capable of physically and structurally
associating with a NR, SR or GR LBD. Non-covalent molecular interactions
important in the association of a NR, SR or GR LBD with its substrate include
hydrogen bonding, van der Waals interactions and hydrophobic interactions.

The interaction between an atom of a LBD amino acid and an atom of an
LBD ligand can be made by any force or attraction described in nature. Usually
the interaction between the atom of the amino acid and the ligand will be the result
of a hydrogen bonding interaction, charge interaction, hydrophobic interaction,
van der Waals interaction or dipole interaction. In the case of the hydrophobic
interaction it is recognized that this is not a per se interaction between the amino
acid and ligand, but rather the usual result, in part, of the repulsion of water or
other hydrophilic group from a hydrophobic surface. Reducing or enhancing the
interaction of the LBD and a ligand can be measured by calculating or testing
binding energies, computationally or using thermodynamic or kinetic methods as
known in the art.

Second, the compound must be able to assume a conformation that allows
it to associate with a NR, SR or GR LBD. Although certain portions of the
compound will not directly participate in this association with a NR, SR or GR
LBD, those portions can still influence the overall conformation of the molecule.
This, in turn, can have a significant impact on potency. Such conformational
requirements include the overall three-dimensional structure and orientation of the
chemical entity or compound in relation to all or a portion of the binding site, e.g.,
the ligand binding pocket or an accessory binding site of a NR, SR or GR LBD, or
the spacing between functional groups of a compound comprising several
chemical entities that directly interact with a NR, SR or GR LBD.

Chemical modifications will often enhance or reduce interactions of an
atom of a LBD amino acid and an atom of an LBD ligand. Steric hindrance can
be a common means of changing the interaction of a LBD binding pocket with an
activation domain. Chemical modifications are preferably introduced at C-H, C-
and C-OH positions in a ligand, where the carbon is part of the ligand structure
that remains the same after modification is complete. In the case of C-H, C could
have 1, 2 or 3 hydrogens, but usually only one hydrogen will be replaced. The H
or OH can be removed after modification is complete and replaced with a desired
chemical moiety.

The potential modulatory or binding effect of a chemical compound on a 
NR, SR or GR LBD can be analyzed prior to its actual synthesis and testing by the 
use of computer modeling techniques that employ the coordinates of a crystalline 
GRa LBD polypeptide of the present invention. If the theoretical structure of the 
given compound suggests insufficient interaction and association between it and a 
NR, SR or GR LBD, synthesis and testing of the compound is obviated. However, 
if computer modeling indicates a strong interaction, the molecule can then be 
synthesized and tested for its ability to bind and modulate the activity of a NR, SR 
or GR LBD. In this manner, synthesis of unproductive or inoperative compounds 
can be avoided.

A modulatory or other binding compound of a NR, SR or GR LBD
polypeptide (preferably a GRa LBD) can be computationally evaluated and
designed via a series of steps in which chemical entities or fragments are
screened and selected for their ability to associate with an individual binding site
or other area of a crystalline GRa LBD polypeptide of the present invention and to
interact with the amino acids disposed in the binding sites.

The atoms of interacting amino acids that form contacts with a ligand are
usually 2 to 4 angstroms away from the centers of the
atoms of the ligand. Generally these distances are determined by computer as
discussed herein and in McRee (McRee, (1993) Practical Protein Crystallography,
Academic Press, New York); however, distances can be determined manually
once the three dimensional model is made. More commonly, the atoms of the
ligand and the atoms of interacting amino acids are 3 to 4 angstroms apart. A
ligand can also interact with distant amino acids, after chemical modification of the
ligand to create a new ligand. Distant amino acids are generally not in contact with
the ligand before chemical modification. A chemical modification can change the
structure of the ligand to make a new ligand that interacts with a distant amino
acid usually at least 4.5 angstroms away from the ligand. Often distant amino
acids will not line the surface of the binding cavity for the ligand, as they are too
far away from the ligand to be part of a pocket or surface of the binding cavity.
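
The distance ranges described above can be illustrated with a minimal sketch. It assumes that ligand and residue atoms are supplied as plain (x, y, z) tuples in angstroms, and the function names, the residue labels and the "intermediate" category for the gap between 4 and 4.5 angstroms are hypothetical choices made for illustration only; they are not part of the disclosed method.

```python
import math

def min_distance(residue_atoms, ligand_atoms):
    """Smallest atom-to-atom distance (angstroms) between a residue and a ligand."""
    return min(math.dist(a, b) for a in residue_atoms for b in ligand_atoms)

def classify_residues(residues, ligand_atoms, contact_max=4.0, distant_min=4.5):
    """Label each residue by its closest approach to the ligand.

    contact_max and distant_min follow the 4 and 4.5 angstrom figures in the
    text; the "intermediate" label is an assumption for distances in between.
    """
    labels = {}
    for name, atoms in residues.items():
        d = min_distance(atoms, ligand_atoms)
        if d <= contact_max:
            labels[name] = "contact"
        elif d >= distant_min:
            labels[name] = "distant"
        else:
            labels[name] = "intermediate"
    return labels

# Hypothetical toy coordinates, not taken from Table 4.
ligand = [(0.0, 0.0, 0.0)]
residues = {
    "RES_A": [(3.5, 0.0, 0.0)],
    "RES_B": [(6.0, 0.0, 0.0)],
    "RES_C": [(4.2, 0.0, 0.0)],
}
labels = classify_residues(residues, ligand)
```

In practice the coordinates would come from a structure file, and the cutoffs could be adjusted to the interaction type under study.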

A variety of methods can be used to screen chemical entities or fragments
for their ability to associate with an NR, SR or GR LBD and, more particularly, with
the individual binding sites of an NR, SR or GR LBD, such as the ligand binding
pocket or an accessory binding site. This process can begin by visual inspection
of, for example, the ligand binding pocket on a computer screen based on the
GRa LBD atomic coordinates in Table 4, as described herein. Selected
fragments or chemical entities can then be positioned in a variety of orientations,
or docked, within an individual binding site of a GRa LBD as defined herein
above. Docking can be accomplished using software programs such as those
available under the tradenames QUANTA™ (Molecular Simulations Inc., San
Diego, California) and SYBYL™ (Tripos, Inc., St. Louis, Missouri), followed by
energy minimization and molecular dynamics with standard molecular mechanics
forcefields, such as CHARM (Brooks et al., (1983) J. Comp. Chem., 8: 132) and
AMBER 5 (Case et al., (1997), AMBER 5, University of California, San Francisco;
Pearlman et al., (1995) Comput. Phys. Commun. 91: 1-41).

Specialized computer programs can also assist in the process of selecting 
fragments or chemical entities. These include: 

1. GRID™ program, version 17 (Goodford, (1985) J. Med. Chem. 28: 849-57),
which is available from Molecular Discovery Ltd., Oxford, UK;

2. MCSS™ program (Miranker & Karplus, (1991) Proteins 11: 29-34),
which is available from Molecular Simulations, Inc., San Diego, California;

3. AUTODOCK™ 3.0 program (Goodsell & Olsen, (1990) Proteins 8: 195-202),
which is available from the Scripps Research Institute, La Jolla, California;

4. DOCK™ 4.0 program (Kuntz et al., (1992) J. Mol. Biol. 161: 269-88),
which is available from the University of California, San Francisco, California;

5. FLEX-X™ program (see Rarey et al., (1996) J. Comput. Aid. Mol. Des.
10: 41-54), which is available from Tripos, Inc., St. Louis, Missouri;

6. MVP program (Lambert, (1997) in Practical Application of Computer-Aided
Drug Design, (Charifson, ed.) Marcel-Dekker, New York, pp. 243-303); and

7. LUDI™ program (Bohm, (1992) J. Comput. Aid. Mol. Des., 6: 61-78),
which is available from Molecular Simulations, Inc., San Diego, California.

Once suitable chemical entities or fragments have been selected, they can 
be assembled into a single compound or modulator. Assembly can proceed by 
visual inspection of the relationship of the fragments to each other on the three- 
dimensional image displayed on a computer screen in relation to the structure 
coordinates of a GRa LBD. Manual model building using software such as
QUANTA™ or SYBYL™ typically follows. 

Useful programs to aid one of ordinary skill in the art in connecting the 
individual chemical entities or fragments include: 

1. CAVEAT™ program (Bartlett et al., (1989) Special Pub., Royal Chem.
Soc. 78: 182-96), which is available from the University of California, Berkeley,
California;

2. 3D Database systems, such as MACCS-3D™ system program, which is
available from MDL Information Systems, San Leandro, California. This area is
reviewed in Martin, (1992) J. Med. Chem. 35: 2145-54; and

3. HOOK™ program (Eisen et al., (1994) Proteins 19: 199-221), which is
available from Molecular Simulations, Inc., San Diego, California.

Instead of proceeding to build a GR LBD modulator (preferably a GRa LBD
modulator) in a step-wise fashion one fragment or chemical entity at a time as
described above, modulatory or other binding compounds can be designed as a
whole or de novo using the structural coordinates of a crystalline GRa LBD
polypeptide of the present invention and either an empty binding site or optionally
including some portion(s) of a known modulator(s). Applicable methods can
employ the following software programs:

1. LUDI™ program (Bohm, (1992) J. Comput. Aid. Mol. Des., 6: 61-78),
which is available from Molecular Simulations, Inc., San Diego, California;

2. LEGEND™ program (Nishibata & Itai, (1991) Tetrahedron 47: 8985);
and

3. LEAPFROG™, which is available from Tripos Associates, St. Louis,
Missouri.

Other molecular modeling techniques can also be employed in accordance 
with this invention. See, e.g., Cohen et al., (1990) J. Med. Chem. 33: 883-94.
See also, Navia & Murcko, (1992) Curr. Opin. Struc. Biol. 2: 202-10; U.S. Patent
No. 6,008,033, herein incorporated by reference. 

Once a compound has been designed or selected by the above methods,
the efficiency with which that compound can bind to a NR, SR or GR LBD can be
tested and optimized by computational evaluation. By way of particular example,
a compound that has been designed or selected to function as a NR, SR or GR
LBD modulator should also preferably traverse a volume not overlapping that
occupied by the binding site when it is bound to its native ligand. Additionally, an
effective NR, SR or GR LBD modulator should preferably demonstrate a relatively
small difference in energy between its bound and free states (i.e., a small
deformation energy of binding). Thus, the most efficient NR, SR and GR LBD
modulators should preferably be designed with a deformation energy of binding of
not greater than about 10 kcal/mole, and preferably, not greater than 7 kcal/mole.
It is possible for NR, SR and GR LBD modulators to interact with the polypeptide
in more than one conformation that is similar in overall binding energy. In those
cases, the deformation energy of binding is taken to be the difference between the
energy of the free compound and the average energy of the conformations
observed when the modulator binds to the polypeptide.
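
The deformation energy criterion above can be expressed as a short calculation. The following is a minimal sketch, assuming that conformer energies (in kcal/mole) have already been obtained from some energy evaluation; the function names and the three-way rating are hypothetical, while the 10 and 7 kcal/mole limits come from the text.

```python
from statistics import fmean

def deformation_energy(free_energy, bound_energies):
    """Average energy of the bound conformations minus the energy of the free compound.

    All energies are in kcal/mole. With a single bound conformation this
    reduces to the simple bound-minus-free difference.
    """
    if not bound_energies:
        raise ValueError("at least one bound conformation energy is required")
    return fmean(bound_energies) - free_energy

def rate_deformation(e_def, limit=10.0, preferred=7.0):
    """Compare a deformation energy with the limits stated in the text."""
    if e_def <= preferred:
        return "preferred"
    if e_def <= limit:
        return "acceptable"
    return "rejected"

# Hypothetical energies for illustration only.
e_def = deformation_energy(-50.0, [-45.0, -43.0])
rating = rate_deformation(e_def)
```

Here two bound conformations of similar energy are averaged, as the text prescribes for a modulator that binds in more than one conformation.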

A compound designed or selected as binding to an NR, SR or GR
polypeptide (preferably a GRa LBD polypeptide) can be further computationally
optimized so that in its bound state it would preferably lack repulsive electrostatic
interaction with the target polypeptide. Such non-complementary (e.g.,
electrostatic) interactions include repulsive charge-charge, dipole-dipole and
charge-dipole interactions. Specifically, the sum of all electrostatic interactions
between the modulator and the polypeptide when the modulator is bound to an
NR, SR or GR LBD preferably makes a neutral or favorable contribution to the
enthalpy of binding.
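
As a rough illustration of the electrostatic criterion, the sum of pairwise point-charge interactions can be computed with Coulomb's law. This is a minimal sketch only: it assumes partial charges in units of the elementary charge, coordinates in angstroms, and a uniform dielectric constant of 4.0, all of which are simplifying assumptions rather than parameters given in this disclosure. The function names are hypothetical.

```python
import math

# Coulomb constant in kcal*angstrom/(mole*e^2), a conversion factor commonly
# used in molecular mechanics.
COULOMB_KCAL = 332.0636

def electrostatic_energy(ligand, receptor, dielectric=4.0):
    """Sum Coulomb energies (kcal/mole) over all ligand-receptor atom pairs.

    Each atom is a (charge, (x, y, z)) tuple. The uniform dielectric is an
    assumption made for this sketch.
    """
    total = 0.0
    for q_l, pos_l in ligand:
        for q_r, pos_r in receptor:
            r = math.dist(pos_l, pos_r)
            if r == 0.0:
                raise ValueError("overlapping atoms: distance is zero")
            total += COULOMB_KCAL * q_l * q_r / (dielectric * r)
    return total

def is_complementary(energy):
    """Neutral or favorable means the summed energy is not positive."""
    return energy <= 0.0

# Hypothetical charges and positions for illustration only.
receptor_atoms = [(0.5, (3.0, 0.0, 0.0))]
attractive = electrostatic_energy([(-0.5, (0.0, 0.0, 0.0))], receptor_atoms)
repulsive = electrostatic_energy([(0.5, (0.0, 0.0, 0.0))], receptor_atoms)
```

The programs listed in this section use far more detailed treatments; the sketch only shows what "neutral or favorable" means for a summed interaction energy.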

Specific computer software is available in the art to evaluate compound 
deformation energy and electrostatic interaction. Examples of programs designed 
for such uses include: 

1. Gaussian 98™, which is available from Gaussian, Inc., Pittsburgh, 
Pennsylvania; 

2. AMBER™ program, version 6.0, which is available from the University 
of California at San Francisco; 

3. QUANTA™ program, which is available from Molecular Simulations,
Inc., San Diego, California;

4. CHARMm® program, which is available from Molecular Simulations,
Inc., San Diego, California; and

5. Insight II® program, which is available from Molecular Simulations, Inc.,
San Diego, California.

These programs can be implemented using a suitable computer system. 
Other hardware systems and software packages will be apparent to those skilled 
in the art after review of the disclosure of the present invention presented herein. 

Once an NR, SR or GR LBD modulating compound has been optimally 
selected or designed, as described above, substitutions can then be made in 
some of its atoms or side groups in order to improve or modify its binding 
properties. Generally, initial substitutions are conservative, i.e., the replacement 
group will have approximately the same size, shape, hydrophobicity and charge 
as the original group. It should, of course, be understood that components known
in the art to alter conformation are preferably avoided. Such substituted chemical
compounds can then be analyzed for efficiency of fit to an NR, SR or GR LBD
binding site using the same computer-based approaches described in detail
above.

IX.B. Distinguishing Between GR Subtypes and Between NRs

The present invention also is applicable to generating new synthetic ligands
to distinguish nuclear receptor subtypes. As described herein, modulators can be
generated that distinguish between subtypes, thereby allowing the generation of
either tissue specific or function specific synthetic ligands. For instance, the GRa
gene can be translated from its mRNA by alternative initiation from an internal
ATG codon (Yudt & Cidlowski (2001) Molec. Endocrinol. 15: 1093-1103). This
codon codes for methionine at position 27 and translation from this position
produces a slightly smaller protein. These two isoforms, translated from the same
gene, are referred to as GR-A and GR-B. It has been shown in a cellular system
that the shorter GR-B form is more effective in initiating transcription from a GRE
compared to GR-A. Additionally, another form of GR, called GRp, is produced by
an alternative splicing event. The GRp protein differs from GRa at the very
C-terminus, where the final 50 amino acids are replaced with a 15 amino acid
segment. These two isoforms are 100% identical up to amino acid 727. No
sequence similarity exists between GRa and GRp at the C-terminus beyond
position 727. GRp has been shown to be a dominant negative regulator of
GRa-mediated gene transcription (Oakley, Sar & Cidlowski (1996) J. Biol. Chem. 271:
9550-9559). It has been suggested that some of the tissue specific effects
observed with glucocorticoid treatment may in part be due to the presence of
varying amounts of isoform in certain cell-types. This method is also applicable to
any other subfamily so organized.

The present invention discloses the ability to generate new synthetic
ligands to distinguish between GR subtypes. As described herein,
computer-designed ligands (i.e. candidate modulators and modulators) can be generated
that distinguish between GR subtypes, thereby allowing the generation of either
tissue specific or function specific ligands. The atomic structural coordinates
disclosed in the present invention reveal structural details unique to GRa. These
structural details can be exploited when a novel ligand is designed using the
methods of the present invention or other ligand design methods known in the art.
The structural features that differentiate, for example, a GRa from a GRp can be
targeted in ligand design. Thus, for example, a ligand can be designed that will
recognize GRa, while not interacting with other GRs or even with moieties having
similar structural features. Prior to the disclosure of the present invention, the
ability to target a GR subtype was unattainable.

The present invention also pertains to a method for designing an agonist or
modulator with desired levels of activity on at least two subtypes, GRa and GRp.
In a preferred embodiment, the method comprises obtaining atomic coordinates
for structures of the GRa and/or GRp ligand binding domains. The structures can
comprise GRa and GRp, each bound to various different ligands, and also can
comprise structures where no ligand is present. The structures can also comprise
models where a compound has been docked into a particular GR using a
molecular docking procedure, such as the MVP program disclosed herein.
Optionally, the structures are rotated and translated so as to superimpose
corresponding Ca or backbone atoms; this facilitates the comparison of
structures.

The GRa and GRp structures can also be compared using a computer 
graphics system to identify regions of the ligand binding site that have similar 
shape and electrostatic character, and to identify regions of the ligand binding site 
that are narrowed or constricted in one or both of the GRs, particularly as 
compared to other NRs. Since these GRs are subject to conformational
changes, attention is paid to the range of motion observed for each protein atom 
over the whole collection of structures. The ligand structures, including both those 
determined by X-ray crystallography and those modeled using molecular docking 
procedures, can be examined using a computer graphics system to identify 
ligands where a chemical modification could increase or decrease binding to a 
particular GR, or decrease activity against a particular GR. Additionally or 
alternatively, the chemical modification can introduce a group into a volume that is 
normally occupied by an atom of that GR. 
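
The range of motion of each protein atom over a collection of structures can be estimated with a short calculation. The sketch below assumes the structures have already been superimposed and that each is supplied as a mapping from an atom identifier to (x, y, z) coordinates in angstroms; the identifiers and the choice of the largest pairwise displacement as the measure of motion are hypothetical.

```python
import math
from itertools import combinations

def range_of_motion(structures):
    """Largest displacement of each atom across pre-superimposed structures.

    structures is a list of dicts mapping atom identifiers to (x, y, z).
    Only atoms present in every structure are reported.
    """
    common = set(structures[0])
    for structure in structures[1:]:
        common &= set(structure)
    motion = {}
    for atom in sorted(common):
        positions = [structure[atom] for structure in structures]
        motion[atom] = max(
            (math.dist(p, q) for p, q in combinations(positions, 2)),
            default=0.0,
        )
    return motion

# Hypothetical atom identifiers and coordinates for illustration only.
structures = [
    {"CA1": (0.0, 0.0, 0.0), "CA2": (5.0, 0.0, 0.0)},
    {"CA1": (0.0, 0.0, 0.0), "CA2": (5.0, 3.0, 0.0)},
    {"CA1": (0.0, 0.0, 0.0), "CA2": (5.0, 0.0, 4.0)},
]
motion = range_of_motion(structures)
```

Atoms with a large value mark flexible regions of the binding site, which may tolerate a chemical modification that a rigid region would not.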

Optionally, to selectively decrease activity against a particular GR the 
chemical modification can be made so as to occupy volume that is normally
occupied by atoms of that particular GR, but not by atoms of the other GRs. To
increase activity against a particular GR, a chemical modification can be made
that improves interactions with that particular GR. To selectively increase activity
against a particular GR, a chemical modification can be made that improves the
interactions with that particular GR, but does not improve the interactions with the
other GRs. Other design principles can also be used to increase or decrease
activity on a particular GR.
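
The volume-occupancy principle for selectively decreasing activity can be checked with a simple steric test. The following is a minimal sketch, assuming superimposed receptor structures given as lists of (x, y, z) atom positions in angstroms and a single point representing the group added by the modification; the 2.0 angstrom clash cutoff and all names are hypothetical assumptions.

```python
import math

def clashes(point, receptor_atoms, cutoff=2.0):
    """True if the added group lies within the cutoff of any receptor atom."""
    return any(math.dist(point, atom) < cutoff for atom in receptor_atoms)

def clash_profile(point, receptors, cutoff=2.0):
    """Report, per receptor, whether the added group would clash with it."""
    return {name: clashes(point, atoms, cutoff) for name, atoms in receptors.items()}

def selectively_disfavors(point, receptors, target, cutoff=2.0):
    """True if the group clashes with the target receptor and with no other."""
    profile = clash_profile(point, receptors, cutoff)
    others = (hit for name, hit in profile.items() if name != target)
    return profile[target] and not any(others)

# Hypothetical receptors and coordinates for illustration only.
receptors = {
    "receptor_1": [(0.0, 0.0, 0.0)],
    "receptor_2": [(5.0, 0.0, 0.0)],
}
added_group = (0.5, 0.0, 0.0)
profile = clash_profile(added_group, receptors)
```

A modification for which `selectively_disfavors` returns True for one receptor is a candidate for reducing activity on that receptor while leaving the others unaffected, subject to confirmation by the assays described herein.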

Thus, various possible compounds and chemical modifications can be
considered and compared graphically, and with molecular modeling tools, for
synthetic feasibility and likelihood of achieving the desired profile of activation of
GRa and GRp. Compounds that appear synthetically feasible and that have a
good likelihood of achieving the desired profile are synthesized. The compounds 
can then be tested for binding and/or activation of GRa and GRp, and tested for 
their overall biological effect. 

A method of identifying a NR modulator that selectively modulates the
biological activity of one NR compared to GRa is also disclosed. In one
embodiment, the method comprises: (a) providing an atomic structure coordinate
set describing a GRa ligand binding domain structure and at least one other
atomic structure coordinate set describing a NR ligand binding domain, each
ligand binding domain comprising a ligand binding site; (b) comparing the atomic
structure coordinate sets to identify at least one difference between the sets; (c)
designing a candidate ligand predicted to interact with the difference of step (b);
(d) synthesizing the candidate ligand; and (e) testing the synthesized candidate
ligand for an ability to selectively modulate a NR as compared to GRa, whereby a
NR modulator that selectively modulates the biological activity of an NR compared to
GRa is identified.
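
Step (b) of the method above, comparing two coordinate sets to identify differences, can be sketched at the level of residue identity. The example assumes that the binding-site residues of the two receptors have already been aligned and are supplied as mappings from an alignment position to a three-letter residue name; the positions and residue names shown are hypothetical and are not taken from any structure disclosed herein.

```python
def site_differences(site_a, site_b):
    """List aligned binding-site positions whose residues differ.

    site_a and site_b map an alignment position to a residue name. A position
    missing from one site is reported with None for that site.
    """
    differences = []
    for position in sorted(set(site_a) | set(site_b)):
        res_a = site_a.get(position)
        res_b = site_b.get(position)
        if res_a != res_b:
            differences.append((position, res_a, res_b))
    return differences

# Hypothetical aligned binding-site residues for illustration only.
site_first = {1: "LEU", 2: "PHE", 3: "GLN"}
site_second = {1: "LEU", 2: "TYR", 4: "SER"}
diffs = site_differences(site_first, site_second)
```

Each reported position is a candidate difference for the ligand design of step (c); a fuller comparison would also consider side-chain positions and pocket shape.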

Preferably, the GRa atomic structure coordinate set is the atomic structure 
coordinate set shown in Table 4. Optionally, the NR is selected from the group 
consisting of MR, PR, AR, GRp and isoforms thereof that have ligands that also
bind GRa.

Method of Screening for Chemical and Biological Modulators of the
Biological Activity of an NR, SR or GR

A candidate substance identified according to a screening assay of the
present invention has an ability to modulate the biological activity of an NR, SR or
GR or an NR, SR or GR LBD polypeptide. In a preferred embodiment, such a
candidate compound can have utility in the treatment of disorders and/or
conditions and/or biological events associated with the biological activity of an NR,
SR or GR or an NR, SR or GR LBD polypeptide, including transcription
modulation.

In a cell-free system, the method comprises the steps of establishing a
control system comprising a GRα polypeptide and a ligand which is capable of
binding to the polypeptide; establishing a test system comprising a GRα
polypeptide, the ligand, and a candidate compound; and determining whether the
candidate compound modulates the activity of the polypeptide by comparison of
the test and control systems. A representative ligand can comprise
dexamethasone or other small molecule, and in this embodiment, the biological
activity or property screened can include binding affinity or transcription regulation.
The GRα polypeptide can be in soluble or crystalline form.
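
The comparison of the test and control systems can be illustrated with a short
Python sketch. It expresses the signal of the test system as a percentage of the
control signal and labels the candidate compound accordingly. The function
names, the background correction and the 20 percent tolerance band are
assumptions made for illustration only and are not values given in this
disclosure.

```python
def percent_of_control(test_signal, control_signal, background=0.0):
    """Return the background-corrected test signal as a percentage of control."""
    corrected_control = control_signal - background
    if corrected_control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (test_signal - background) / corrected_control


def classify_candidate(test_signal, control_signal, background=0.0, tolerance=20.0):
    """Label a candidate by how the test system compares with the control system.

    tolerance is a hypothetical band (in percent) around 100 within which
    the candidate is treated as having no clear effect.
    """
    pct = percent_of_control(test_signal, control_signal, background)
    if pct < 100.0 - tolerance:
        return "decreases activity"
    if pct > 100.0 + tolerance:
        return "increases activity"
    return "no clear effect"
```

In practice the signal could be any readout of the activity being screened, such
as bound ligand or reporter transcription, and the tolerance would be set from
the variability of replicate control measurements.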

In another embodiment of the invention, a soluble or a crystalline form of a
GRα polypeptide or a catalytic or immunogenic fragment or oligopeptide thereof,
can be used for screening libraries of compounds in any of a variety of drug
screening techniques. The fragment employed in such a screening can be affixed
to a solid support. The formation of binding complexes, between a soluble or a
crystalline GRα polypeptide and the agent being tested, will be detected. In a
preferred embodiment, the soluble or crystalline GRα polypeptide has an amino
acid sequence of any of SEQ ID NOs:4, 6, 8 or 10. When a GRα LBD
polypeptide is employed, a preferred embodiment will include a soluble or a
crystalline GRα polypeptide having the amino acid sequence of any of SEQ ID
NOs:12, 14, 16 or 31.

Another technique for drug screening which can be used provides for high
throughput screening of compounds having suitable binding affinity to the protein
of interest as described in published PCT application WO 84/03564, herein
incorporated by reference. In this method, as applied to a soluble or crystalline
polypeptide of the present invention, large numbers of different small test
compounds are synthesized on a solid substrate, such as plastic pins or some
other surface. The test compounds are reacted with the soluble or crystalline
polypeptide, or fragments thereof. Bound polypeptide is then detected by
methods known to those of skill in the art. The soluble or crystalline polypeptide
can also be placed directly onto plates for use in the aforementioned drug
screening techniques.

In yet another embodiment, a method of screening for a modulator of an
NR, SR or GR or an NR, SR or GR LBD polypeptide comprises: providing a library
of test samples; contacting a soluble or a crystalline form of an NR, SR or GR or a
soluble or crystalline form of an NR, SR or GR LBD polypeptide with each test
sample; detecting an interaction between a test sample and a soluble or a
crystalline form of an NR, SR or GR or a soluble or a crystalline form of an NR,
SR or GR LBD polypeptide; identifying a test sample that interacts with a soluble
or a crystalline form of an NR, SR or GR or a soluble or a crystalline form of an
NR, SR or GR LBD polypeptide; and isolating a test sample that interacts with a
soluble or a crystalline form of an NR, SR or GR or a soluble or a crystalline form
of an NR, SR or GR LBD polypeptide.

In each of the foregoing embodiments, an interaction can be detected
spectrophotometrically, radiologically, colorimetrically or immunologically. An
interaction between a soluble or a crystalline form of an NR, SR or GR or a
soluble or a crystalline form of an NR, SR or GR LBD polypeptide and a test
sample can also be quantified using methodology known to those of skill in the art.

In accordance with the present invention there is also provided a rapid and
high throughput screening method that relies on the methods described above.
This screening method comprises separately contacting each of a plurality of
substantially identical samples with a soluble or a crystalline form of an NR, SR or
GR or a soluble or a crystalline form of an NR, SR or GR LBD and detecting a
resulting binding complex. In such a screening method the plurality of samples
preferably comprises more than about 10^4 samples, or more preferably comprises
more than about 5 x 10^4 samples.

In another embodiment, a method for identifying a substance that
modulates GR LBD function is also provided. In a preferred embodiment, the
method comprises: (a) isolating a GR polypeptide of the present invention; (b)
exposing the isolated GR polypeptide to a plurality of substances; (c) assaying
binding of a substance to the isolated GR polypeptide; and (d) selecting a
substance that demonstrates specific binding to the isolated GR LBD polypeptide.
By the term "exposing the GR polypeptide to a plurality of substances", it is meant
exposure both in pools and as multiple samples of "discrete" pure substances.

IX.D. Method of Identifying Compounds Which Inhibit Ligand Binding
In one aspect of the present invention, an assay method for identifying a
compound that inhibits binding of a ligand to an NR, SR or GR polypeptide is
disclosed. A ligand, such as dexamethasone (which associates with at least GR),
can be used in the assay method as the ligand against which the inhibition by a
test compound is gauged. In the following discussion of Section IX.D., it will be
understood that although GR is used as an example, the method is equally
applicable to any NR, SR or GR polypeptide. The method comprises (a)
incubating a GR polypeptide with a ligand in the presence of a test inhibitor
compound; (b) determining an amount of ligand that is bound to the GR
polypeptide, wherein decreased binding of ligand to the GR polypeptide in the
presence of the test inhibitor compound relative to binding in the absence of the
test inhibitor compound is indicative of inhibition; and (c) identifying the test
compound as an inhibitor of ligand binding if decreased ligand binding is
observed. Preferably, the ligand is dexamethasone.
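
Step (b) of the assay method reduces to a simple calculation, sketched here in
Python. The first function converts bound-ligand measurements into percent
inhibition; the second estimates, by log-linear interpolation, the inhibitor
concentration at which half of the ligand remains bound. Both function names
and the interpolation scheme are illustrative assumptions; the disclosure does
not prescribe a particular way of analysing the data.

```python
import math


def percent_inhibition(bound_with_inhibitor, bound_without_inhibitor):
    """Percent decrease in bound ligand caused by the test inhibitor compound."""
    if bound_without_inhibitor <= 0:
        raise ValueError("binding in the absence of inhibitor must be positive")
    return 100.0 * (1.0 - bound_with_inhibitor / bound_without_inhibitor)


def estimate_ic50(concentrations, bound_fractions):
    """Estimate the concentration giving 50% bound ligand.

    concentrations are inhibitor concentrations (any consistent unit, > 0);
    bound_fractions are bound ligand relative to the no-inhibitor control.
    Interpolates linearly in log10(concentration) between the two points
    that bracket 0.5. Returns None when 0.5 is not bracketed by the data.
    """
    points = sorted(zip(concentrations, bound_fractions))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= 0.5 >= f_hi and f_lo != f_hi:
            t = (f_lo - 0.5) / (f_lo - f_hi)
            log_c = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None
```

A value such as this, compared across successive compounds, is one way the
decrease in ligand binding could be tracked during inhibitor refinement.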

In another aspect of the present invention, the disclosed assay method can
be used in the structural refinement of candidate GR inhibitors. For example,
multiple rounds of optimization can be followed by gradual structural changes in a
strategy of inhibitor design. A strategy such as this is made possible by the
disclosure of the atomic coordinates of the GRα LBD.

X. Design, Preparation and Structural Analysis of Additional NR, SR and GR
Polypeptides and NR, SR and GR LBD Mutants and Structural Equivalents

The present invention provides for the generation of NR, SR and GR
polypeptides and NR, SR or GR mutants (preferably GRα and GRα LBD
mutants), and the ability to solve the crystal structures of those that crystallize.
Indeed, a GRα LBD having a point mutation was crystallized and solved in one
aspect of the present invention. Thus, an aspect of the present invention involves
the use of both targeted and random mutagenesis of the GR gene for the
production of a recombinant protein with improved or desired characteristics for
the purpose of crystallization, characterization of biologically relevant protein-
protein interactions, and compound screening assays, or for the production of a
recombinant protein having other desirable characteristic(s). Polypeptide
products produced by the methods of the present invention are also disclosed
herein.

The structure coordinates of a NR, SR or GR LBD provided in accordance
with the present invention also facilitate the identification of related proteins or
enzymes analogous to GRα in function, structure or both (for example, a GRβ),
which can lead to novel therapeutic modes for treating or preventing a range of
disease states. More particularly, through the provision of the mutagenesis
approaches as well as the three-dimensional structure of a GRα LBD disclosed
herein, desirable sites for mutation are identified.

X.A. Sterically Similar Compounds 
A further aspect of the present invention is that sterically similar
compounds can be formulated to mimic the key portions of an NR, SR or GR LBD
structure. Such compounds are functional equivalents. The generation of a
structural functional equivalent can be achieved by the techniques of modeling
and chemical design known to those of skill in the art and described herein.
Modeling and chemical design of NR, SR or GR and NR, SR or GR LBD structural
equivalents can be based on the structure coordinates of a crystalline GRα LBD
polypeptide of the present invention. It will be understood that all such sterically
similar constructs fall within the scope of the present invention.

X.B. NR, SR and GR Polypeptides

The generation of chimeric GR polypeptides is also an aspect of the
present invention. Such a chimeric polypeptide can comprise an NR, SR or GR
LBD polypeptide or a portion of an NR, SR or GR LBD (e.g. a GRα LBD) that is
fused to a candidate polypeptide or a suitable region of the candidate polypeptide,
for example GRβ. Throughout the present disclosure it is intended that the term
"mutant" encompass not only mutants of an NR, SR or GR LBD polypeptide but
chimeric proteins generated using an NR, SR or GR LBD as well. It is thus
intended that the following discussion of mutant NR, SR and GR LBDs apply
mutatis mutandis to chimeric NR, SR and GR polypeptides and NR, SR and GR
LBD polypeptides and to structural equivalents thereof.

In accordance with the present invention, a mutation can be directed to a
particular site or combination of sites of a wild-type NR, SR or GR LBD. For
example, an accessory binding site or the binding pocket can be chosen for
mutagenesis. Similarly, a residue having a location on, at or near the surface of
the polypeptide can be replaced, resulting in an altered surface charge of one or
more charge units, as compared to the wild-type NR, SR or GR and NR, SR or
GR LBDs. Alternatively, an amino acid residue in an NR, SR or GR or an NR, SR
or GR LBD can be chosen for replacement based on its hydrophilic or
hydrophobic characteristics.

Such mutants can be characterized by any one of several different
properties, i.e. a "desired" or "predetermined" characteristic as compared with the
wild type NR, SR or GR LBD. For example, such mutants can have an altered
surface charge of one or more charge units, or can have an increase in overall
stability. Other mutants can have altered substrate specificity in comparison with,
or a higher specific activity than, a wild-type NR, SR or GR or an NR, SR or GR
LBD.

NR, SR or GR and NR, SR or GR LBD mutants of the present invention
can be generated in a number of ways. For example, the wild-type sequence of
an NR, SR or GR or an NR, SR or GR LBD can be mutated at those sites
identified using this invention as desirable for mutation, by means of
oligonucleotide-directed mutagenesis or other conventional methods, such as
deletion. Alternatively, mutants of an NR, SR or GR or an NR, SR or GR LBD can
be generated by the site-specific replacement of a particular amino acid with an
unnaturally occurring amino acid. In addition, NR, SR or GR or NR, SR or GR
LBD mutants can be generated through replacement of an amino acid residue, for
example, a particular cysteine or methionine residue, with selenocysteine or
selenomethionine. This can be achieved by growing a host organism capable of
expressing either the wild-type or mutant polypeptide on a growth medium
depleted of either natural cysteine or methionine (or both) but enriched in
selenocysteine or selenomethionine (or both).
As disclosed in the Examples presented below, mutations can be
introduced into a DNA sequence coding for an NR, SR or GR or an NR, SR or GR
LBD using synthetic oligonucleotides. These oligonucleotides contain nucleotide
sequences flanking the desired mutation sites. Mutations can be generated in the
full-length DNA sequence of an NR, SR or GR or an NR, SR or GR LBD or in any
sequence coding for polypeptide fragments of an NR, SR or GR or an NR, SR or
GR LBD.

According to the present invention, a mutated NR, SR or GR or NR, SR or
GR LBD DNA sequence produced by the methods described above, or any
alternative methods known in the art, can be expressed using an expression
vector. An expression vector, as is well known to those of skill in the art, typically
includes elements that permit autonomous replication in a host cell independent of
the host genome, and one or more phenotypic markers for selection purposes.
Either prior to or after insertion of the DNA sequences surrounding the desired
NR, SR or GR or NR, SR or GR LBD mutant coding sequence, an expression
vector also will include control sequences encoding a promoter, operator,
ribosome binding site, translation initiation signal, and, optionally, a repressor
gene or various activator genes and a signal for termination. In some
embodiments, where secretion of the produced mutant is desired, nucleotides
encoding a "signal sequence" can be inserted prior to an NR, SR or GR or an NR,
SR or GR LBD mutant coding sequence. For expression under the direction of
the control sequences, a desired DNA sequence must be operatively linked to the
control sequences; that is, the sequence must have an appropriate start signal in
front of the DNA sequence encoding the NR, SR or GR or NR, SR or GR LBD
mutant, and the correct reading frame to permit expression of that sequence
under the control of the control sequences and production of the desired product
encoded by that NR, SR or GR or NR, SR or GR LBD sequence must be
maintained.

After a review of the disclosure of the present invention presented herein,
any of a wide variety of well-known available expression vectors can be useful to
express a mutated coding sequence of this invention. These include, for example,
vectors consisting of segments of chromosomal, non-chromosomal and synthetic
DNA sequences, such as various known derivatives of SV40, known bacterial
plasmids, e.g., plasmids from E. coli including col E1, pCR1, pBR322, pMB9 and
their derivatives, wider host range plasmids, e.g., RP4, phage DNAs, e.g., the
numerous derivatives of phage λ, e.g., NM 989, and other DNA phages, e.g., M13
and filamentous single stranded DNA phages, yeast plasmids and vectors derived
from combinations of plasmids and phage DNAs, such as plasmids which have
been modified to employ phage DNA or other expression control sequences. In
the preferred embodiments of this invention, vectors amenable to expression in a
pET-based expression system are employed. The pET expression system is
available from Novagen/Invitrogen, Inc., Carlsbad, California. Expression and
screening of a polypeptide of the present invention in bacteria, preferably E. coli,
is a preferred aspect of the present invention.

In addition, any of a wide variety of expression control sequences
(sequences that control the expression of a DNA sequence when operatively
linked to it) can be used in these vectors to express the mutated DNA sequences
according to this invention. Such useful expression control sequences include,
for example, the early and late promoters of SV40 for animal cells, the lac system,
the trp system, the TAC or TRC system, the major operator and promoter regions
of phage λ, the control regions of fd coat protein, all for E. coli, the promoter for
3-phosphoglycerate kinase or other glycolytic enzymes, the promoters of acid
phosphatase, e.g., Pho5, the promoters of the yeast α-mating factors for yeast,
and other sequences known to control the expression of genes of prokaryotic or
eukaryotic cells or their viruses, and various combinations thereof.

A wide variety of hosts are also useful for producing mutated NR, SR or GR
and NR, SR or GR LBD polypeptides according to this invention. These hosts
include, for example, bacteria, such as E. coli, Bacillus and Streptomyces, fungi,
such as yeasts, and animal cells, such as CHO and COS-1 cells, plant cells,
insect cells, such as SF9 cells, and transgenic host cells. Expression and
screening of a polypeptide of the present invention in bacteria, preferably E. coli,
is a preferred aspect of the present invention.

It should be understood that not all expression vectors and expression
systems function in the same way to express mutated DNA sequences of this
invention, and to produce modified NR, SR or GR and NR, SR or GR LBD
polypeptides or NR, SR or GR or NR, SR or GR LBD mutants. Neither do all
hosts function equally well with the same expression system. One of skill in the
art can, however, make a selection among these vectors, expression control
sequences and hosts without undue experimentation and without departing from
the scope of this invention. For example, an important consideration in selecting a
vector will be the ability of the vector to replicate in a given host. The copy
number of the vector, the ability to control that copy number, and the expression
of any other proteins encoded by the vector, such as antibiotic markers, should
also be considered.

In selecting an expression control sequence, a variety of factors should
also be considered. These include, for example, the relative strength of the
system, its controllability and its compatibility with the DNA sequence encoding a
modified NR, SR or GR or NR, SR or GR LBD polypeptide of this invention, with
particular regard to the formation of potential secondary and tertiary structures.

Hosts should be selected by consideration of their compatibility with the
chosen vector, the toxicity of a modified polypeptide to them, their ability to
express mature products, their ability to fold proteins correctly, their fermentation
requirements, the ease of purification of a modified GR or GR LBD and safety.
Within these parameters, one of skill in the art can select various
vector/expression control system/host combinations that will produce useful
amounts of a mutant polypeptide. A mutant polypeptide produced in these
systems can be purified, for example, via the approaches disclosed in the
Examples.

Once a mutation(s) has been generated in the desired location, such as an
active site or dimerization site, the mutants can be tested for any one of several
properties of interest, i.e. "desired" or "predetermined" positions. For example,
mutants can be screened for an altered charge at physiological pH. This property
can be determined by measuring the mutant polypeptide isoelectric point (pI) and



WO 03/015692 PCT/US02/22648 

-77- 

comparing the observed value with that of the wild-type parent. Isoelectric point
can be measured by gel-electrophoresis according to the method of Wellner
(Wellner, (1971) Anal. Chem. 43: 597). A mutant polypeptide containing a
replacement amino acid located at the surface of the enzyme, as provided by the
structural information of this invention, can lead to an altered surface charge and
an altered pI.
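
Before a mutant is made and its isoelectric point measured, the expected
direction of the change can be estimated from sequence alone. The Python
sketch below is such an estimate and is not a substitute for the gel-based
measurement described above. It sums Henderson-Hasselbalch charges over the
ionizable groups and finds the pH of zero net charge by bisection. The pKa
values are rough textbook-style figures chosen for illustration, and all names
are assumptions of this sketch.

```python
# Approximate side-chain and terminal pKa values (illustrative assumptions).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.3, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 3.0


def net_charge(sequence, ph):
    """Estimated net charge of a one-letter-code polypeptide at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))   # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))  # deprotonated C-terminus
    for aa in sequence:
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def isoelectric_point(sequence, tolerance=1e-4):
    """Find the pH at which the estimated net charge is zero, by bisection.

    Works because the estimated net charge decreases steadily as pH rises.
    """
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positive, so the pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0
```

Comparing the estimate for a wild-type sequence with that for a mutant in
which, for example, a valine is replaced by a lysine shows the expected rise in
both net charge at physiological pH and pI.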

X.C. Generation of Engineered NR, SR or GR or NR, SR or GR LBD
Mutants

In another aspect of the present invention, a unique NR, SR or GR or NR,
SR or GR LBD polypeptide is generated. Such a mutant can facilitate purification
and the study of the structure and the ligand-binding abilities of a NR, SR or GR
polypeptide. Thus, an aspect of the present invention involves the use of both
targeted and random mutagenesis of the GR gene for the production of a
recombinant protein with improved solution characteristics for the purpose of
crystallization, characterization of biologically relevant protein-protein interactions,
and compound screening assays, or for the production of a recombinant
polypeptide having other characteristics of interest. Expression of the polypeptide
in bacteria, preferably E. coli, is also an aspect of the present invention.

In one embodiment, targeted mutagenesis was performed using a
sequence alignment of several nuclear receptors, primarily steroid receptors.
Several residues that were hydrophobic in GR and hydrophilic in other receptors
were chosen for mutagenesis. Most of these residues were predicted to be
solvent exposed hydrophobic residues in GR. Therefore, mutations were made to
change these hydrophobic residues to hydrophilic in an attempt to improve the
solubility and stability of E. coli-expressed GR LBD. Table 2 immediately below
presents a list of mutations that were made and tested for expression in E.
coli.

Table 2

Mutations of the GR LBD (521-777) Gene for
Testing Solution Solubility and Stability

Single mutations    Double mutations    Triple mutations

V552K               L535T/V538S         M691T/V702T/W712T
W557S               V552K/W557S
F602S               L636E/C638S
F602D
F602E
L636E
Y648Q
W712S
L741R
F602Y
F602T
F602N
F602C

Random mutagenesis can be performed on residues where a significant
difference, hydrophobic versus hydrophilic, is observed between GR and other
steroid receptors based on sequence alignment. Such positions can be
randomized by oligo-directed or cassette mutagenesis. A GR LBD protein library
can be sorted by an appropriate display system to select mutants with improved
solution properties. Residues in GR that meet the criteria for such an approach
include: V538, V552, W557, F602, L636, Y648, Y660, L685, M691, V702, W712,
L733, and Y764. In addition, residues predicted to neighbor these positions could
also be randomized.
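
The mutation labels used in Table 2 and in the residue list above follow a
common convention: the wild-type residue in one-letter code, the residue
number, then the replacement residue, with a slash joining the members of a
double or triple mutant. The Python sketch below parses such labels and applies
them to a sequence fragment whose first residue number is known. The function
names are hypothetical, and any sequence passed in is supplied by the user; no
GR sequence is reproduced here.

```python
import re

# One-letter wild-type residue, residue number, one-letter replacement.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label such as 'F602S' into ('F', 602, 'S')."""
    match = MUTATION_RE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label!r}")
    wild_type, number, replacement = match.groups()
    return wild_type, int(number), replacement


def apply_mutations(fragment, first_residue_number, label):
    """Apply a single or slash-joined multiple mutation to a fragment.

    first_residue_number is the numbering of fragment[0] in the full-length
    protein. The wild-type residue in the label is checked against the
    fragment so that numbering mistakes are caught early.
    """
    residues = list(fragment)
    for part in label.split("/"):
        wild_type, number, replacement = parse_mutation(part)
        index = number - first_residue_number
        if not 0 <= index < len(residues):
            raise ValueError(f"{part}: residue {number} is outside the fragment")
        if residues[index] != wild_type:
            raise ValueError(f"{part}: fragment has {residues[index]} at {number}")
        residues[index] = replacement
    return "".join(residues)
```

For a fragment spanning residues 521-777, first_residue_number would be 521,
so that a label such as F602S addresses index 81 of the fragment.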

In another embodiment, complete random mutagenesis can be performed
on any residue within the context of the GR LBD. A method such as error
incorporating PCR or chemical-based mutagenesis can be used to introduce
mutations in an unbiased manner. These methods randomize the position of
mutation as well as the nature of the mutated residue. A completely random GR
LBD library can be screened for improved expression with the appropriate
expression or display system. Ideally, the selection method should identify mutant
proteins with increased expression, solubility, stability, and/or activity. A
technique well suited for this purpose is the "peptides-on-plasmid" display system
that utilizes the DNA-binding activity of the lac repressor (LacI). GR, or another
nuclear receptor LBD, can be expressed as a fusion to either LacI or a fragment of
LacI, such as the "headpiece dimer", that comprises the DNA-binding domain.
Because the plasmid that expresses the fusion protein also comprises a lac
operon binding site, the protein will be physically coupled to the plasmid. GR
mutants that produce soluble protein can then be isolated using either the
coactivator peptide- or ligand-binding activity of the receptor. Table 2A below
shows mutations that were prepared using the LacI-based "peptides-on-plasmids"
technique with GR LBD.



Table 2A

Random Mutations of the GR LBD (521-777) Gene for Improving
Solution Solubility and Stability

Single mutations    SEQ ID NO    Double mutations    SEQ ID NO

W557R               33           F602L/A580T         38
Q615L               34           L563F/G583C         39
Q615H               35           L664H/M752T         40
A574T               36           L563F/T744N         41
L620M               37


A method of modifying a test NR polypeptide is thus disclosed. The 
method can comprise: providing a test NR polypeptide sequence having a 
characteristic that is targeted for modification; aligning the test NR polypeptide 
sequence with at least one reference NR polypeptide sequence for which an X-ray 
structure is available, wherein the at least one reference NR polypeptide 
sequence has a characteristic that is desired for the test NR polypeptide; building 
a three-dimensional model for the test NR polypeptide using the three- 
dimensional coordinates of the X-ray structure(s) of the at least one reference 
polypeptide and its sequence alignment with the test NR polypeptide sequence; 
examining the three-dimensional model of the test NR polypeptide for differences 
with the at least one reference polypeptide that are associated with the desired 
characteristic; and mutating at least one amino acid residue in the test NR 
polypeptide sequence located at a difference identified above to a residue 
associated with the desired characteristic, whereby the test NR polypeptide is 
modified. By the term "associated with a desired characteristic" it is meant that a 
residue is found in the reference polypeptide at a point of difference wherein the 
difference provides a desired characteristic or phenotype in the reference 
polypeptide. 

A method of altering the solubility of a test NR polypeptide is also disclosed 
in accordance with the present invention. In a preferred embodiment, the method 
comprises: (a) providing a reference NR polypeptide sequence and a test NR 
polypeptide sequence; (b) comparing the reference NR polypeptide sequence and 
the test NR polypeptide sequence to identify one or more residues in the test NR 
sequence that are more or less hydrophilic than a corresponding residue in the 
reference NR polypeptide sequence; and (c) mutating the residue in the test NR 
polypeptide sequence identified in step (b) to a residue having a different 
hydrophilicity, whereby the solubility of the test NR polypeptide is altered.

By the term "altering" it is meant any change in the solubility of the test NR
polypeptide, including preferably a change to make the polypeptide more soluble.
Such approaches to obtain soluble proteins for crystallization studies have been
successfully demonstrated in the case of HIV integrase and the
human leptin cytokine. See Dyda, F., et al., Science (1994) Dec 23;
266(5193):1981-6; and Zhang et al., Nature (1997) May 8; 387(6629):206-9.

Typically, such a change involves substituting a residue that is more 
hydrophilic than the wild type residue. Hydrophobicity and hydrophilicity criteria 
and comparison information are set forth herein below. Optionally, the reference
NR polypeptide sequence is an AR or a PR sequence, and the test polypeptide 
sequence is a GR polypeptide sequence. Alternatively, the reference polypeptide 
sequence is a crystalline GR LBD. The comparing of step (b) is preferably by 
sequence alignment. More preferably, the screening is carried out in bacteria, 
even more preferably, in E. coli. 
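
Steps (a) to (c) of the solubility-altering method can be sketched in Python as a
scan over two aligned sequences. The sketch below uses the published
Kyte-Doolittle hydropathy scale as a stand-in for the hydrophobicity criteria of
this disclosure, which may differ; the function name, the gap character and the
minimum hydropathy difference of 2.0 are likewise assumptions made for
illustration.

```python
# Kyte-Doolittle hydropathy values (higher means more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def solubility_candidates(test_aligned, reference_aligned,
                          first_residue_number=1, min_gap=2.0):
    """Suggest mutations that make the test sequence more hydrophilic.

    Both inputs are rows of one pairwise alignment ('-' marks a gap) and
    must have equal length. A position is reported when the test residue
    is more hydrophobic than the reference residue by at least min_gap.
    Labels use test-sequence numbering, e.g. 'F11S'.
    """
    if len(test_aligned) != len(reference_aligned):
        raise ValueError("aligned sequences must have the same length")
    candidates = []
    number = first_residue_number - 1
    for test_aa, ref_aa in zip(test_aligned, reference_aligned):
        if test_aa != "-":
            number += 1  # only residues present in the test sequence are numbered
        if test_aa == "-" or ref_aa == "-":
            continue
        gap = KYTE_DOOLITTLE[test_aa] - KYTE_DOOLITTLE[ref_aa]
        if gap >= min_gap:
            candidates.append(f"{test_aa}{number}{ref_aa}")
    return candidates
```

Candidates produced in this way are only a starting list; whether a given
residue is solvent exposed still has to be judged from the three-dimensional
structure or model.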

A method for modifying a test NR polypeptide to alter and preferably 
improve the solubility, stability in solution and other solution behavior, to alter and 
preferably improve the folding and stability of the folded structure, and to alter and 
preferably improve the ability to form ordered crystals is also provided in 
accordance with the present invention. The aforementioned characteristics are 
representative "desired" or "predetermined" characteristics or phenotypes.
In a preferred embodiment, the method comprises: 

(a) providing a test NR polypeptide sequence for which the solubility, 
stability in solution, other solution behavior, tendency to fold properly, ability to 
form ordered crystals, or combination thereof is different from that desired; 

(b) aligning the test NR polypeptide sequence with the sequences of other 
reference NR polypeptides for which the X-ray structure is available and for which 
the solution properties, folding behavior and crystallization properties are closer to 
those desired; 

(c) building a three-dimensional model for the test NR polypeptide using the
three-dimensional coordinates of the X-ray structure(s) of one or more of the
reference polypeptides and their sequence alignment with the test NR polypeptide
sequence;

(d) optionally, optimizing the side-chain conformations in the three-dimensional
model by generating many alternative side-chain conformations,
refining by energy minimization, and selecting side-chain conformations with lower
energy;

(e) examining the three-dimensional model for the test NR graphically for
lipophilic side-chains that are exposed to solvent, for clusters of two or more
lipophilic side-chains exposed to solvent, for lipophilic pockets and clefts on the
surface of the protein model, and in particular for sites on the surface of the
protein model that are more lipophilic than the corresponding sites on the
structure(s) of the reference NR polypeptide(s);

(f) for each residue identified in step (e), mutating the amino acid to an
amino acid with different hydrophilicity, and usually to a more hydrophilic amino
acid, whereby the exposed lipophilic sites are reduced, and the solution properties
improved;

(g) examining the three-dimensional model graphically at each site where
the amino acid in the test NR polypeptide is different from the amino acid at the
corresponding position in the reference NR polypeptide, and checking whether the
amino acid in the test NR polypeptide makes favorable interactions with the atoms
that lie around it in the three-dimensional model, considering the side-chain
conformations predicted in step (c) and, optionally, step (d), as well as likely
alternative conformations of the side-chains, and also considering the possible
presence of water molecules (for this analysis, an amino acid is considered to
make "favorable interactions with the atoms that lie around it" if these interactions
are more favorable than the interactions that would be obtained if it was replaced
by any of the 19 other naturally-occurring amino acids);

(h) for each residue identified in step (g) as not making favorable
interactions with the atoms that lie around it, mutating the residue to another
amino acid that could make better interactions with the atoms that lie around it,
thereby promoting the tendency for the test NR polypeptide to fold into a stable
structure with improved solution properties, less tendency to unfold, and greater
tendency to form ordered crystals;

(i) examining the three-dimensional model graphically at each residue
position where the amino acid in the test NR polypeptide is different from the
amino acid at the corresponding position in the reference NR polypeptide, and
checking whether the steric packing, hydrogen bonding and other energetic
interactions could be improved by mutating that residue or any one or more of the
surrounding residues lying within 8 angstroms in the three-dimensional model;

(j) for each residue position identified in step (i) as potentially allowing an
improvement in the packing, hydrogen bonding and energetic interactions,
mutating those residues individually or in combination to residues that could
improve the packing, hydrogen bonding and energetic interactions, thereby
promoting the tendency for the test NR polypeptide to fold into a stable structure
with improved solution properties, less tendency to unfold, and greater tendency
to form ordered crystals.
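
The neighborhood search implied by step (i) can be sketched in Python as
follows. The text does not say how the 8 angstrom distance between residues is
to be measured; this sketch assumes the minimum distance between any pair of
atoms of the two residues. The function name and the data layout (a dictionary
from residue label to a list of atom coordinates taken from the model) are also
assumptions.

```python
import math


def residues_within(model, center, cutoff=8.0):
    """List residues lying within cutoff angstroms of a chosen residue.

    model maps a residue label (for example 'F602') to a list of (x, y, z)
    atom coordinates. The distance between two residues is taken as the
    smallest atom-to-atom distance. Returns (label, distance) pairs sorted
    from nearest to farthest, excluding the center residue itself.
    """
    center_atoms = model[center]
    neighbors = []
    for label, atoms in model.items():
        if label == center:
            continue
        closest = min(math.dist(a, b) for a in center_atoms for b in atoms)
        if closest <= cutoff:
            neighbors.append((label, closest))
    return sorted(neighbors, key=lambda pair: pair[1])
```

Each residue returned would then be inspected, together with the center
residue, for possible improvements in packing and hydrogen bonding as set out
in step (j).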

By the term "graphically" it is meant through the use of computer aided
graphics, such as by the use of a software package disclosed herein above.
Optionally, in this embodiment, the reference NR polypeptide is AR, or preferably
PR, when the test NR polypeptide is GRα. Alternatively, the reference NR
polypeptide is GRα, and the test NR polypeptide is GRβ or MR.

An isolated GR polypeptide comprising a mutation in a ligand binding 
domain, wherein the mutation alters the solubility of the ligand binding domain, is 
also disclosed. An isolated GR polypeptide, or functional portion thereof, having 
one or more mutations comprising a substitution of a hydrophobic amino acid 
residue by a hydrophilic amino acid residue in a ligand binding domain is also 
disclosed. Preferably, in each case, the mutation can be at a residue selected 
from the group consisting of V552, W557, F602, L636, Y648, W712, L741, L535, 
V538, C638, M691, V702, Y648, Y660, L685, M691, V702, W712, L733, Y764 
and combinations thereof. More preferably, the mutation is selected from the 
group consisting of V552K, W557S, F602S, F602D, F602E, F602Y, F602T, 
F602N, F602C, L636E, Y648Q, W712S, L741R, L535T, V538S, C638S, M691T,
V702T, W712T and combinations thereof. Even more preferably, the mutation is
made by targeted point or randomizing mutagenesis. Hydrophobicity and
hydrophilicity criteria and comparison information are set forth herein below.

As discussed above, the GRα gene can be translated from its mRNA by
alternative initiation from an internal ATG codon (Yudt & Cidlowski (2001) Molec.
Endocrinol. 15: 1093-1103). This codon codes for methionine at position 27 and
translation from this position produces a slightly smaller protein. These two
isoforms, translated from the same gene, are referred to as GR-A and GR-B. It
has been shown in a cellular system that the shorter GR-B form is more effective
in initiating transcription from a GRE compared to GR-A. Additionally, another
form of GR, called GRβ, is produced by an alternative splicing event. The GRβ
protein differs from GRα at the very C-terminus, where the final 50 amino acids
are replaced with a 15 amino acid segment. These two isoforms are 100%
identical up to amino acid 727. No sequence similarity exists between GRα and
GRβ at the C-terminus beyond position 727. GRβ has been shown to be a
dominant negative regulator of GRα-mediated gene transcription (Oakley, Sar &
Cidlowski (1996) J. Biol. Chem. 271: 9550-9559). It has been suggested that
some of the tissue specific effects observed with glucocorticoid treatment may in
part be due to the presence of varying amounts of isoform in certain cell-types.
This method is also applicable to any other subfamily so organized. Thus, while
the amino acid residue numbers referenced above pertain to GR-A, the
polypeptides of the present invention also have a mutation at an analogous
position in any polypeptide based on a sequence alignment (such as prepared by
BLAST or other approach disclosed herein or known in the art) to GRα, which are
not set forth herein for convenience.
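
Locating an "analogous position" from a pairwise sequence alignment can be sketched as follows. This is a minimal Python illustration that assumes the alignment has already been produced (for example by BLAST) and is supplied as two equal-length gapped strings. The function name `map_position` and the toy sequences are hypothetical and are not taken from any SEQ ID NO.

```python
def map_position(aligned_ref, aligned_target, ref_pos, gap="-"):
    """Map 1-based residue number `ref_pos` of the reference sequence to the
    1-based residue number in the same alignment column of the target.
    Returns None when the reference residue is aligned to a gap."""
    if len(aligned_ref) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    ref_count = target_count = 0
    for r, t in zip(aligned_ref, aligned_target):
        if r != gap:
            ref_count += 1
        if t != gap:
            target_count += 1
        if r != gap and ref_count == ref_pos:
            return None if t == gap else target_count
    raise ValueError("ref_pos is beyond the end of the reference sequence")


# Toy alignment with made-up sequences.
ref = "MKT--FLV"
target = "-KTAAFIV"
analogous = map_position(ref, target, 4)  # reference F4 aligns with target F5
```

Residue numbers in the two sequences drift apart wherever the alignment contains gaps, which is why the mapping is done column by column rather than by reusing the reference number.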

As used in the following discussion, the terms "engineered NR, SR or GR",
"engineered NR, SR or GR LBD", "NR, SR or GR mutant", and "NR, SR or GR
LBD mutant" refer to polypeptides having amino acid sequences that contain at
least one mutation in the wild-type sequence, including at an analogous position
in any polypeptide based on a sequence alignment to GRα. The terms also refer
to NR, SR or GR and NR, SR or GR LBD polypeptides which are capable of
exerting a biological effect in that they comprise all or a part of the amino acid
sequence of an engineered mutant polypeptide of the present invention, or cross-
react with antibodies raised against an engineered mutant polypeptide, or retain
all or some or an enhanced degree of the biological activity of the engineered
mutant amino acid sequence or protein. Such biological activity can include the
binding of small molecules in general, the binding of glucocorticoids in particular
and even more particularly the binding of dexamethasone.

The terms "engineered NR, SR or GR LBD" and "NR, SR or GR LBD
mutant" also include analogs of an engineered NR, SR or GR polypeptide or NR,
no need for an engineered mutant polypeptide to comprise all, or substantially all
of the amino acid sequence of the wild type polypeptide (e.g. SEQ ID NOs:2 or

Shorter or longer sequences are anticipated to be of use in the invention;
shorter sequences are herein referred to as "segments". Thus, the terms
"engineered NR, SR or GR LBD" and "NR, SR or GR LBD mutant" also include
fusion, chimeric or recombinant engineered NR, SR or GR LBD or NR, SR or GR
LBD mutant polypeptides and proteins comprising sequences of the present
invention. Methods of preparing such proteins are disclosed herein above.

X.D. Sequence Similarity and Identity

As used herein, the term "substantially similar" as applied to GR means
that a particular sequence varies from the nucleic acid sequence of any of odd
numbered SEQ ID NOs:1-15, or the amino acid sequence of any of even
numbered SEQ ID NOs:2-16 by one or more deletions, substitutions, or additions,
the net effect of which is to retain at least some of the biological activity of the natural
gene, gene product, or sequence. Such sequences include "mutant" or
"polymorphic" sequences, or sequences in which the biological activity and/or the
physical properties are altered to some degree but which retain at least some or an
enhanced degree of the original biological activity and/or physical properties. In
determining nucleic acid sequences, all subject nucleic acid sequences capable of
encoding substantially similar amino acid sequences are considered to be
substantially similar to a reference nucleic acid sequence, regardless of
differences in codon sequences or substitution of equivalent amino acids to create
biologically functional equivalents.

X.D.1. Sequences That are Substantially Identical to an Engineered
NR, SR or GR or NR, SR or GR LBD Mutant Sequence of the
Present Invention

Nucleic acids that are substantially identical to a nucleic acid sequence of
an engineered NR, SR or GR or NR, SR or GR LBD mutant of the present
invention, e.g. allelic variants, genetically altered versions of the gene, etc., bind to
an engineered NR, SR or GR or NR, SR or GR LBD mutant sequence under
stringent hybridization conditions. By using probes, particularly labeled probes of
DNA sequences, one can isolate homologous or related genes. The source of
homologous genes can be any species, e.g. primate species; rodents, such as
rats and mice; canines, felines, bovines, equines, yeast, nematodes, etc.

Between mammalian species, e.g. human and mouse, homologs have
substantial sequence similarity, i.e. at least 75% sequence identity between
nucleotide sequences. Sequence similarity is calculated based on a reference
sequence, which can be a subset of a larger sequence, such as a conserved
motif, coding region, flanking region, etc. A reference sequence will usually be at
least about 18 nt long, more usually at least about 30 nt long, and can extend to
the complete sequence that is being compared. Algorithms for sequence analysis
are known in the art, such as BLAST, described in Altschul et al., (1990) J. Mol.
Biol. 215: 403-10. Software for performing BLAST analyses is publicly available
through the National Center for Biotechnology Information
(http://www.ncbi.nlm.nih.gov/).

This algorithm involves first identifying high scoring sequence pairs (HSPs)
by identifying short words of length W in the query sequence, which either match
or satisfy some positive-valued threshold score T when aligned with a word of the
same length in a database sequence. T is referred to as the neighborhood word
score threshold. These initial neighborhood word hits act as seeds for initiating
searches to find longer HSPs containing them. The word hits are then extended in
both directions along each sequence for as far as the cumulative alignment score
can be increased. Cumulative scores are calculated using, for nucleotide
sequences, the parameters M (reward score for a pair of matching residues;
always > 0) and N (penalty score for mismatching residues; always < 0). For
amino acid sequences, a scoring matrix is used to calculate the cumulative score.






WO 03/015692 

PCT/US02/22648 


Extension of the word hits in each direction is halted when the cumulative
alignment score falls off by the quantity X from its maximum achieved value, the
cumulative score goes to zero or below due to the accumulation of one or more
negative-scoring residue alignments, or the end of either sequence is reached.
The BLAST algorithm parameters W, T, and X determine the sensitivity and speed
of the alignment. The BLASTN program (for nucleotide sequences) uses as
defaults a wordlength W=11, an expectation E=10, a cutoff of 100, M=5, N=-4,
and a comparison of both strands. For amino acid sequences, the BLASTP
program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the
BLOSUM62 scoring matrix. See Henikoff & Henikoff, (1992) Proc Natl Acad Sci
U.S.A. 89: 10915.
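
The word-hit extension and its halting rules can be illustrated with a short sketch. The following Python function is a simplified, ungapped, one-directional illustration for nucleotide sequences using the M and N values quoted above. It is not the BLAST implementation; the function name `extend_right` and the default drop-off value `x_drop=20` are assumptions chosen only for the example.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend an exact word hit to the right without gaps.

    Returns (best_score, extension_length), where extension_length is the
    number of residues beyond the seed word in the best-scoring extension.
    """
    score = best = match * word_len  # the seed word is an exact match
    best_len = 0
    steps = 0
    i, j = q_start + word_len, s_start + word_len
    # The loop condition stops extension at the end of either sequence.
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        steps += 1
        if score > best:
            best, best_len = score, steps
        # Halt when the score has fallen X below its maximum,
        # or has dropped to zero or below.
        if best - score >= x_drop or score <= 0:
            break
        i += 1
        j += 1
    return best, best_len


# Toy sequences: an 8-base identical stretch followed by mismatches.
result = extend_right("ACGTACGTTTTT", "ACGTACGTAAAA", 0, 0, word_len=4)
```

A full search would also extend to the left and, in later BLAST versions, allow gaps; the sketch only shows how M, N and X interact during a single extension.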

In addition to calculating percent sequence identity, the BLAST algorithm 
also performs a statistical analysis of the similarity between two sequences. See,
e.g., Karlin and Altschul, (1993) Proc Natl Acad Sci U.S.A. 90: 5873-5877. One
measure of similarity provided by the BLAST algorithm is the smallest sum 
probability (P(N)), which provides an indication of the probability by which a match 
between two nucleotide or amino acid sequences would occur by chance. For 
example, a test nucleic acid sequence is considered similar to a reference 
sequence if the smallest sum probability in a comparison of the test nucleic acid 
sequence to the reference nucleic acid sequence is less than about 0.1, more 
preferably less than about 0.01, and most preferably less than about 0.001. 

Percent identity or percent similarity of a DNA or peptide sequence can be
determined, for example, by comparing sequence information using the GAP
computer program, available from the University of Wisconsin Genetics
Computer Group. The GAP program utilizes the alignment method of Needleman
et al., (1970) J. Mol. Biol. 48: 443, as revised by Smith et al., (1981) Adv. Appl.
Math. 2: 482. Briefly, the GAP program defines similarity as the number of aligned
symbols (i.e., nucleotides or amino acids) which are similar, divided by the total
number of symbols in the shorter of the two sequences. The preferred
parameters for the GAP program are the default parameters, which do not impose
a penalty for end gaps. See, e.g., Schwartz et al., eds., (1979), Atlas of Protein
Sequence and Structure, National Biomedical Research Foundation, pp. 357-358,
and Gribskov et al., (1986) Nucl. Acids. Res. 14: 6745.

The term "similarity" is contrasted with the term "identity". Similarity is
defined as above; "identity", however, means a nucleic acid or amino acid
sequence having the same amino acid at the same relative position in a given
family member of a gene family. Homology and similarity are generally viewed as
broader terms than the term identity. Biochemically similar amino acids, for
example leucine/isoleucine or glutamate/aspartate, can be present at the same
position; these are not identical per se, but are biochemically "similar." As
disclosed herein, these are referred to as conservative differences or conservative
substitutions. This differs from a conservative mutation at the DNA level, which
changes the nucleotide sequence without making a change in the encoded amino
acid, e.g. TCC to TCA, both of which encode serine.
As used herein, DNA analog sequences are "substantially identical" to
specific DNA sequences disclosed herein if: (a) the DNA analog sequence is
derived from coding regions of the nucleic acid sequence shown in any one of odd
numbered SEQ ID NOs:1-15; or (b) the DNA analog sequence is capable of
hybridization with DNA sequences of (a) under stringent conditions and which
encode a biologically active GRα or GRα LBD gene product; or (c) the DNA
sequences are degenerate as a result of alternative genetic code to the DNA
analog sequences defined in (a) and/or (b). Substantially identical analog proteins
and nucleic acids will have between about 70% and 80%, preferably between
about 81% to about 90% or even more preferably between about 91% and 99%
sequence identity with the corresponding sequence of the native protein or nucleic
acid. Sequences having lesser degrees of identity but comparable biological
activity are considered to be equivalents.

As used herein, "stringent conditions" means conditions of high stringency,
for example 6X SSC, 0.2% polyvinylpyrrolidone, 0.2% Ficoll, 0.2% bovine serum
albumin, 0.1% sodium dodecyl sulfate, 100 µg/ml salmon sperm DNA and 15%
formamide at 68°C. For the purposes of specifying additional conditions of high
stringency, preferred conditions are salt concentration of about 200 mM and
temperature of about 45°C. One example of such stringent conditions is
hybridization at 4X SSC, at 65°C, followed by a washing in 0.1X SSC at 65°C for
one hour. Another exemplary stringent hybridization scheme uses 50%
formamide, 4X SSC at 42°C.

In contrast, nucleic acids having sequence similarity are detected by
hybridization under lower stringency conditions. Thus, sequence identity can be
determined by hybridization under lower stringency conditions, for example, at
50°C or higher and 0.1X SSC (9 mM NaCl/0.9 mM sodium citrate) and the
sequences will remain bound when subjected to washing at 55°C in 1X SSC.

As used herein, the term "complementary sequences" means nucleic acid
sequences that are base-paired according to the standard Watson-Crick
complementarity rules. The present invention also encompasses the use of
nucleotide segments that are complementary to the sequences of the present
invention.

Hybridization can also be used for assessing complementary sequences
and/or isolating complementary nucleotide sequences. As discussed above,
nucleic acid hybridization will be affected by such conditions as salt concentration,
temperature, or organic solvents, in addition to the base composition, length of the
complementary strands, and the number of nucleotide base mismatches between
the hybridizing nucleic acids, as will be readily appreciated by those skilled in the
art. Stringent temperature conditions will generally include temperatures in
excess of about 30°C, typically in excess of about 37°C, and preferably in excess
of about 45°C. Stringent salt conditions will ordinarily be less than about 1,000
mM, typically less than about 500 mM, and preferably less than about 200 mM.
However, the combination of parameters is much more important than the
measure of any single parameter. See, e.g., Wetmur & Davidson, (1968) J. Mol.
Biol. 31: 349-70. Determining appropriate hybridization conditions to identify
and/or isolate sequences containing high levels of homology is well known in the
art. See, e.g., Sambrook et al., (1989) Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor, New York.

X.D.2. Functional Equivalents of an Engineered NR, SR or GR or
NR, SR, GR LBD Mutant Nucleic Acid Sequence of the
Present Invention

As used herein, the term "functionally equivalent codon" is used to refer to
codons that encode the same amino acid, such as the AGC and AGU codons for
serine. For example, GRα or GRα LBD-encoding nucleic acid sequences
comprising any one of odd numbered SEQ ID NOs:1-15, which have functionally
equivalent codons are covered by the present invention. Thus, when referring to
the sequence example presented in odd numbered SEQ ID NOs:1-15, applicants
provide substitution of functionally equivalent codons into the sequence example
of odd numbered SEQ ID NOs:1-15. Thus, applicants are in possession of
amino acid and nucleic acids sequences which include such substitutions but
which are not set forth herein in their entirety for convenience.
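
Functionally equivalent codons can be enumerated directly from the standard genetic code. The following Python sketch builds the standard codon table and groups codons by the amino acid they encode; the function names `equivalent_codons` and `are_functionally_equivalent` are hypothetical names chosen for this illustration.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order
# with the first base varying slowest; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def to_rna(codon):
    """Normalize a DNA or RNA codon to upper-case RNA letters."""
    return codon.upper().replace("T", "U")


def equivalent_codons(codon):
    """Return all codons that encode the same amino acid as `codon`."""
    aa = CODON_TABLE[to_rna(codon)]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def are_functionally_equivalent(codon_a, codon_b):
    """True if both codons encode the same amino acid."""
    return CODON_TABLE[to_rna(codon_a)] == CODON_TABLE[to_rna(codon_b)]


serine_codons = equivalent_codons("AGU")
```

Serine is one of three amino acids with six codons, so any serine codon in a coding sequence has five functionally equivalent alternatives.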

It will also be understood by those of skill in the art that amino acid and
nucleic acid sequences can include additional residues, such as additional N- or
C-terminal amino acids or 5' or 3' nucleic acid sequences, and yet still be
essentially as set forth in one of the sequences disclosed herein, so long as the
sequence retains biological protein activity where polypeptide expression is
concerned. The addition of terminal sequences particularly applies to nucleic acid
sequences which can, for example, include various non-coding sequences
flanking either of the 5' or 3' portions of the coding region or can include various
internal sequences, i.e., introns, which are known to occur within genes.

X.D.3. Biological Equivalents 

The present invention envisions and includes biological equivalents of an
engineered NR, SR or GR or NR, SR or GR LBD mutant polypeptide of the
present invention. The term "biological equivalent" refers to proteins having amino
acid sequences which are substantially identical to the amino acid sequence of an
engineered NR, SR or GR LBD mutant of the present invention and which are
capable of exerting a biological effect in that they are capable of binding small
molecules or cross-reacting with anti-NR, SR or GR or NR, SR or GR LBD
mutant antibodies raised against an engineered mutant NR, SR or GR or NR, SR
or GR LBD polypeptide of the present invention.

For example, certain amino acids can be substituted for other amino acids
in a protein structure without appreciable loss of interactive capacity with, for
example, structures in the nucleus of a cell. Since it is the interactive capacity and
nature of a protein that defines that protein's biological functional activity, certain
amino acid sequence substitutions can be made in a protein sequence (or the
nucleic acid sequence encoding it) to obtain a protein with the same, enhanced, or
antagonistic properties. Such properties can be achieved by interaction with the
normal targets of the protein, but this need not be the case, and the biological
activity of the invention is not limited to a particular mechanism of action. It is thus
in accordance with the present invention that various changes can be made in the
amino acid sequence of an engineered NR, SR or GR or NR, SR or GR LBD
mutant polypeptide of the present invention or its underlying nucleic acid
sequence without appreciable loss of biological utility or activity.

Biologically equivalent polypeptides, as used herein, are polypeptides in
which certain, but not most or all, of the amino acids can be substituted. Thus,
when referring to the sequence examples presented in any of even numbered
SEQ ID NOs:2-16, applicants envision substitution of codons that encode
biologically equivalent amino acids, as described herein, into a sequence example
of even numbered SEQ ID NOs:2-16, respectively. Thus, applicants are in
possession of amino acid and nucleic acids sequences which include such
substitutions but which are not set forth herein in their entirety for convenience.

Alternatively, functionally equivalent proteins or peptides can be created via 
the application of recombinant DNA technology, in which changes in the protein 
structure can be engineered, based on considerations of the properties of the 
amino acids being exchanged, e.g. substitution of Ile for Leu. Changes designed
by man can be introduced through the application of site-directed mutagenesis 
techniques, e.g., to introduce improvements to the antigenicity of the protein or to 
test an engineered mutant polypeptide of the present invention in order to 
modulate lipid-binding or other activity, at the molecular level. 

Amino acid substitutions, such as those which might be employed in
modifying an engineered mutant polypeptide of the present invention are
generally, but not necessarily, based on the relative similarity of the amino acid
side-chain substituents, for example, their hydrophobicity, hydrophilicity, charge,
size, and the like. An analysis of the size, shape and type of the amino acid side-
chain substituents reveals that arginine, lysine and histidine are all positively
charged residues; that alanine, glycine and serine are all of similar size; and that
phenylalanine, tryptophan and tyrosine all have a generally similar shape.
Therefore, based upon these considerations, arginine, lysine and histidine;
alanine, glycine and serine; and phenylalanine, tryptophan and tyrosine; are
defined herein as biologically functional equivalents. Those of skill in the art will
appreciate other biologically functionally equivalent changes. It is implicit in the
above discussion, however, that one of skill in the art can appreciate that a
radical, rather than a conservative substitution is warranted in a given situation.
Non-conservative substitutions in engineered mutant LBD polypeptides of the
present invention are also an aspect of the present invention.

In making biologically functional equivalent amino acid substitutions, the
hydropathic index of amino acids can be considered. Each amino acid has been
assigned a hydropathic index on the basis of its hydrophobicity and charge
characteristics; these are: isoleucine (+ 4.5); valine (+ 4.2); leucine (+ 3.8);
phenylalanine (+ 2.8); cysteine (+ 2.5); methionine (+ 1.9); alanine (+ 1.8); glycine
(-0.4); threonine (-0.7); serine (-0.8); tryptophan (-0.9); tyrosine (-1.3); proline
(-1.6); histidine (-3.2); glutamate (-3.5); glutamine (-3.5); aspartate (-3.5);
asparagine (-3.5); lysine (-3.9); and arginine (-4.5).

The importance of the hydropathic amino acid index in conferring
interactive biological function on a protein is generally understood in the art (Kyte
& Doolittle, (1982), J. Mol. Biol. 157: 105-132, incorporated herein by reference).
It is known that certain amino acids can be substituted for other amino acids
having a similar hydropathic index or score and still retain a similar biological
activity. In making changes based upon the hydropathic index, the substitution of
amino acids whose hydropathic indices are within ±2 of the original value is
preferred, those which are within ±1 of the original value are particularly preferred,
and those within ±0.5 of the original value are even more particularly preferred.
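
The hydropathic index criteria above amount to a simple lookup and comparison. The following Python sketch encodes the index values listed in the text, keyed by one-letter amino acid code, and reports which preference tier a proposed substitution falls into. The function name `substitution_tier` and the label used for substitutions outside all three ranges are assumptions made for this illustration.

```python
# Hydropathic index values as listed in the text (Kyte & Doolittle, 1982).
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def substitution_tier(original, replacement):
    """Classify a substitution by the difference in hydropathic index."""
    # Round to one decimal to avoid floating-point noise at the boundaries.
    diff = round(abs(HYDROPATHY[original] - HYDROPATHY[replacement]), 1)
    if diff <= 0.5:
        return "even more particularly preferred"
    if diff <= 1.0:
        return "particularly preferred"
    if diff <= 2.0:
        return "preferred"
    return "outside the preferred ranges"


tier_ile_val = substitution_tier("I", "V")  # difference of 0.3
tier_phe_ser = substitution_tier("F", "S")  # difference of 3.6
```

A hydrophobic-to-hydrophilic change such as phenylalanine to serine falls outside all three ranges, consistent with its being a non-conservative rather than a functionally equivalent substitution.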

It is also understood in the art that the substitution of like amino acids can
be made effectively on the basis of hydrophilicity. U.S. Patent No. 4,554,101,
incorporated herein by reference, states that the greatest local average
hydrophilicity of a protein, as governed by the hydrophilicity of its adjacent amino
acids, correlates with its immunogenicity and antigenicity, i.e. with a biological
property of the protein. It is understood that an amino acid can be substituted for
another having a similar hydrophilicity value and still obtain a biologically
equivalent protein.

As detailed in U.S. Patent No. 4,554,101, the following hydrophilicity values
have been assigned to amino acid residues: arginine (+ 3.0); lysine (+ 3.0);
aspartate (+ 3.0±1); glutamate (+ 3.0±1); serine (+ 0.3); asparagine (+ 0.2);
glutamine (+ 0.2); glycine (0); threonine (-0.4); proline (-0.5±1); alanine (-0.5);
histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine (-1.5); leucine (-1.8);
isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan (-3.4).

In making changes based upon similar hydrophilicity values, the 
substitution of amino acids whose hydrophilicity values are within ±2 of the original 
value is preferred, those which are within ±1 of the original value are particularly 
preferred, and those within ±0.5 of the original value are even more particularly 
preferred. 
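
The same hydrophilicity values can be used to quantify how strongly a point mutation shifts a residue toward hydrophilic character. The following Python sketch parses mutation names of the form used in this document (for example F602S) and reports the change in hydrophilicity value. The values are those listed above, with the central value used where a ±1 range is given; the function names are hypothetical.

```python
import re

# Hydrophilicity values as listed in the text (U.S. Patent No. 4,554,101),
# keyed by one-letter code; central values are used where a range is given.
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}

MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(name):
    """Split a name such as 'F602S' into (wild-type, position, replacement)."""
    match = MUTATION_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a point-mutation name: {name!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def hydrophilicity_change(name):
    """Return replacement value minus wild-type value; positive means the
    mutation makes the position more hydrophilic."""
    wild_type, _, replacement = parse_mutation(name)
    delta = HYDROPHILICITY[replacement] - HYDROPHILICITY[wild_type]
    return round(delta, 1)


changes = {m: hydrophilicity_change(m) for m in ("F602S", "W712S", "V552K")}
```

Each of these three mutations gives a positive change well beyond the ±2 window, which is what one would expect for substitutions intended to alter solubility rather than to preserve equivalence.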

While discussion has focused on functionally equivalent polypeptides 
arising from amino acid changes, it will be appreciated that these changes can be 
effected by alteration of the encoding DNA, taking into consideration also that the 
genetic code is degenerate and that two or more codons can code for the same 
amino acid. 

Thus, it will also be understood that this invention is not limited to the 
particular amino acid and nucleic acid sequences of any of SEQ ID NOs:1-16. 
Recombinant vectors and isolated DNA segments can therefore variously include 
an engineered NR, SR or GR or NR, SR or GR LBD mutant polypeptide-encoding 
region itself, include coding regions bearing selected alterations or modifications 
in the basic coding region, or include larger polypeptides which nevertheless 
comprise an NR, SR or GR or NR, SR or GR LBD mutant polypeptide-encoding 
regions or can encode biologically functional equivalent proteins or polypeptides 
which have variant amino acid sequences. Biological activity of an engineered 
NR, SR or GR or NR, SR or GR LBD mutant polypeptide can be determined, for 
example, by transcription assays known to those of skill in the art.

The nucleic acid segments of the present invention, regardless of the
length of the coding sequence itself, can be combined with other DNA sequences,
such as promoters, enhancers, polyadenylation signals, additional restriction
enzyme sites, multiple cloning sites, other coding segments, and the like, such
that their overall length can vary considerably. It is therefore contemplated that a
nucleic acid fragment of almost any length can be employed, with the total length
preferably being limited by the ease of preparation and use in the intended
recombinant DNA protocol. For example, nucleic acid fragments can be prepared
which include a short stretch complementary to a nucleic acid sequence set forth
in any of odd numbered SEQ ID NOs:1-15, such as about 10 nucleotides, and
which are up to 10,000 or 5,000 base pairs in length. DNA segments with total
lengths of about 4,000, 3,000, 2,000, 1,000, 500, 200, 100, and about 50 base
pairs in length are also useful.

The DNA segments of the present invention encompass biologically
functional equivalents of engineered NR, SR or GR, or NR, SR or GR LBD mutant
polypeptides. Such sequences can arise as a consequence of codon redundancy
and functional equivalency that are known to occur naturally within nucleic acid
sequences and the proteins thus encoded. Alternatively, functionally equivalent
proteins or polypeptides can be created via the application of recombinant DNA
technology, in which changes in the protein structure can be engineered, based
on considerations of the properties of the amino acids being exchanged.
Changes can be introduced through the application of site-directed mutagenesis
techniques, e.g., to introduce improvements to the antigenicity of the protein or to
test variants of an engineered mutant of the present invention in order to examine
the degree of binding activity, or other activity at the molecular level. Various site-
directed mutagenesis techniques are known to those of skill in the art and can be
employed in the present invention.

The invention further encompasses fusion proteins and peptides wherein
an engineered mutant coding region of the present invention is aligned within the
same expression unit with other proteins or peptides having desired functions,
such as for purification or immunodetection purposes.

Recombinant vectors form important further aspects of the present
invention. Particularly useful vectors are those in which the coding portion of the
DNA segment is positioned under the control of a promoter. The promoter can be
that naturally associated with an NR, SR or GR gene, as can be obtained by
isolating the 5' non-coding sequences located upstream of the coding segment or
exon, for example, using recombinant cloning and/or PCR technology and/or other
methods known in the art, in conjunction with the compositions disclosed herein.

In other embodiments, certain advantages will be gained by positioning the
coding DNA segment under the control of a recombinant, or heterologous,
promoter. As used herein, a recombinant or heterologous promoter is a promoter
that is not normally associated with an NR, SR or GR gene in its natural
environment. Such promoters can include promoters isolated from bacterial, viral,
eukaryotic, or mammalian cells. Naturally, it will be important to employ a
promoter that effectively directs the expression of the DNA segment in the cell
type chosen for expression. The use of promoter and cell type combinations for
protein expression is generally known to those of skill in the art of molecular
biology (See, e.g., Sambrook et al., (1989) Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor Laboratory, New York, specifically incorporated
herein by reference). The promoters employed can be constitutive or inducible
and can be used under the appropriate conditions to direct high level expression
of the introduced DNA segment, such as is advantageous in the large-scale
production of recombinant proteins or peptides. One preferred promoter system
contemplated for use in high-level expression is a T7 promoter-based system.



Antibodies to an Engineered NR, SR or GR or NR, SR, GR LBD
Mutant Polypeptide of the Present Invention

The present invention also provides an antibody that specifically binds an
engineered NR, SR or GR or NR, SR, GR LBD mutant polypeptide and methods
to generate same. The term "antibody" indicates an immunoglobulin protein or
functional portion thereof, including a polyclonal antibody, a monoclonal antibody,
a chimeric antibody, a single chain antibody, Fab fragments, and a Fab
expression library. "Functional portion" refers to the part of the protein that binds
a molecule of interest. In a preferred embodiment, an antibody of the invention is
a monoclonal antibody. Techniques for preparing and characterizing antibodies
are well known in the art (See, e.g., Harlow & Lane (1988) Antibodies: A
Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor,
New York). A monoclonal antibody of the present invention can be readily
prepared through use of well-known techniques such as the hybridoma
techniques exemplified in U.S. Patent No. 4,196,265 and the phage-displayed
techniques disclosed in U.S. Patent No. 5,260,203.

The phrase "specifically (or selectively) binds to an antibody", or
"specifically (or selectively) immunoreactive with", when referring to a protein or
peptide, refers to a binding reaction which is determinative of the presence of the
protein in a heterogeneous population of proteins and other biological materials.
Thus, under designated immunoassay conditions, the specified antibodies bind to
a particular protein and do not show significant binding to other proteins present in
the sample. Specific binding to an antibody under such conditions can require an
antibody that is selected for its specificity for a particular protein. For example,
antibodies raised to a protein with an amino acid sequence encoded by any of the
nucleic acid sequences of the invention can be selected to obtain antibodies
specifically immunoreactive with that protein and not with unrelated proteins.

The use of a molecular cloning approach to generate antibodies,
particularly monoclonal antibodies, and more particularly single chain monoclonal
antibodies, is also provided. The production of single chain antibodies has been
described in the art. See, e.g., U.S. Patent No. 5,260,203. For this approach,
combinatorial immunoglobulin phagemid libraries are prepared from RNA isolated
from the spleen of the immunized animal, and phagemids expressing appropriate
antibodies are selected by panning on endothelial tissue. The advantages of this
approach over conventional hybridoma techniques are that approximately 10^4
times as many antibodies can be produced and screened in a single round, and
that new specificities are generated by heavy (H) and light (L) chain combinations
in a single chain, which further increases the chance of finding appropriate
antibodies. Thus, an antibody of the present invention, or a "derivative" of an
antibody of the present invention, pertains to a single polypeptide chain binding
molecule which has binding specificity and affinity substantially similar to the
binding specificity and affinity of the light and heavy chain aggregate variable
region of an antibody described herein.

The term "immunochemical reaction", as used herein, refers to any of a
variety of immunoassay formats used to detect antibodies specifically bound to a
particular protein, including but not limited to competitive and non-competitive
assay systems using techniques such as radioimmunoassays, ELISA (enzyme
linked immunosorbent assay), "sandwich" immunoassays, immunoradiometric
assays, gel diffusion precipitation reactions, immunodiffusion assays, in situ
immunoassays (e.g., using colloidal gold, enzyme or radioisotope labels), western
blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays,
hemagglutination assays), complement fixation assays, immunofluorescence
assays, protein A assays, and immunoelectrophoresis assays, etc. See Harlow &
Lane (1988) for a description of immunoassay formats and conditions.

Method for Detecting an Engineered NR, SR or GR or NR, SR, GR
LBD Mutant Polypeptide or a Nucleic Acid Molecule Encoding the
Same

In another aspect of the invention, a method is provided for detecting a
level of an engineered NR, SR or GR or NR, SR, GR LBD mutant polypeptide
using an antibody that specifically recognizes an engineered NR, SR or GR or
NR, SR, GR LBD mutant polypeptide, or portion thereof. In a preferred
embodiment, biological samples from an experimental subject and a control
subject are obtained, and an engineered NR, SR or GR or NR, SR, GR LBD
mutant polypeptide is detected in each sample by immunochemical reaction with
the antibody. More preferably, the antibody recognizes amino acids of any one of
the even-numbered SEQ ID NOs:4, 6, 8, 12, 14, and 16, and is prepared
according to a method of the present invention for producing such an antibody.

In one embodiment, an antibody is used to screen a biological sample for 
the presence of an engineered NR, SR or GR or NR, SR, GR LBD mutant 
polypeptide. A biological sample to be screened can be a biological fluid such as 
extracellular or intracellular fluid, or a cell or tissue extract or homogenate. A 
biological sample can also be an isolated cell (e.g., in culture) or a collection of 
cells such as in a tissue sample or histology sample. A tissue sample can be 
suspended in a liquid medium or fixed onto a solid support such as a microscope 
slide. In accordance with a screening assay method, a biological sample is 
exposed to an antibody immunoreactive with an engineered NR, SR or GR or NR, 
SR, GR LBD mutant polypeptide whose presence is being assayed, and the 
formation of antibody-polypeptide complexes is detected. Techniques for 
detecting such antibody-antigen conjugates or complexes are well known in the 
art and include but are not limited to centrifugation, affinity chromatography and 
the like, and binding of a labeled secondary antibody to the antibody-candidate 
receptor complex. 

In another aspect of the invention, a method is provided for detecting a
nucleic acid molecule that encodes an engineered NR, SR or GR or NR, SR, GR
LBD mutant polypeptide. According to the method, a biological sample having
nucleic acid material is procured and hybridized under stringent hybridization
conditions to an engineered NR, SR or GR or NR, SR, GR LBD mutant
polypeptide-encoding nucleic acid molecule of the present invention. Such
hybridization enables a nucleic acid molecule of the biological sample and an
engineered NR, SR or GR or NR, SR, GR LBD mutant polypeptide-encoding
nucleic acid molecule to form a detectable duplex structure. Preferably, the
engineered NR, SR or GR or NR, SR, GR LBD mutant polypeptide-encoding
nucleic acid molecule includes some or all nucleotides of any one of the
odd-numbered SEQ ID NOs:3, 5, 7, 11, 13, and 15. Also preferably, the biological
sample comprises human nucleic acid material.

XI. The Role of the Three-Dimensional Structure of the GRα LBD in Solving
Additional NR, SR or GR Crystals

Because polypeptides can crystallize in more than one crystal form, the
structural coordinates of a GRα LBD, or portions thereof, as provided by the
present invention, are particularly useful in solving the structure of other crystal
forms of GRα and the crystalline forms of other NRs, SRs and GRs. The
coordinates provided in the present invention can also be used to solve the
structure of NR, SR or GR and NR, SR or GR LBD mutants (such as those
described in Sections IX and X above), NR, SR or GR LBD co-complexes, or of
the crystalline form of any other protein with significant amino acid sequence
homology to any functional domain of NR, SR or GR.

XI.A. Determining the Three-Dimensional Structure of a Polypeptide Using
the Three-Dimensional Structure of the GRα LBD as a Template in
Molecular Replacement
One method that can be employed for the purpose of solving additional GR
crystal structures is molecular replacement. See generally, Rossmann, ed., (1972)
The Molecular Replacement Method, Gordon & Breach, New York. In the
molecular replacement method, the unknown crystal structure, whether it is
another crystal form of a GRα or a GRα LBD (i.e., a GRα or a GRα LBD mutant),
or an NR, SR or GR or an NR, SR or GR LBD polypeptide complexed with
another compound (a "co-complex"), or the crystal of some other protein with
significant amino acid sequence homology to any functional region of the GRα
LBD, can be determined using the GRα LBD structure coordinates provided in
Table 4. This method provides an accurate structural form for the unknown
crystal more quickly and efficiently than attempting to determine such information
ab initio.

In addition, in accordance with this invention, NR, SR or GR and NR, SR or
GR LBD mutants can be crystallized in complex with known modulators. The
crystal structures of a series of such complexes can then be solved by molecular
replacement and compared with that of the wild-type NR, SR or GR or the
wild-type NR, SR or GR LBD. Potential sites for modification within the various
binding sites of the enzyme can thus be identified. This information provides an
additional tool for determining the most efficient binding interactions, for example,
increased hydrophobic interactions, between the GRα LBD and a chemical entity
or compound.

All of the complexes referred to in the present disclosure can be studied
using X-ray diffraction techniques (See, e.g., Blundell & Johnson (1985)
Method. Enzymol., 114A & 115B, (Wyckoff et al., eds.), Academic Press; McRee,
(1993) Practical Protein Crystallography, Academic Press, New York) and can be
refined using computer software, such as the X-PLOR™ program (Brunger,
(1992) X-PLOR, Version 3.1. A System for X-ray Crystallography and NMR, Yale
University Press, New Haven, Connecticut; X-PLOR is available from Molecular
Simulations, Inc., San Diego, California) and the XTAL-VIEW program (McRee,
(1992) J. Mol. Graphics 10: 44-46; McRee, (1993) Practical Protein
Crystallography, Academic Press, San Diego, California). This information can
thus be used to optimize known classes of GR and GR LBD modulators, and
more importantly, to design and synthesize novel classes of GR and GR LBD
modulators.

Laboratory Examples

The following Laboratory Examples have been included to illustrate
preferred modes of the invention. Certain aspects of the following Laboratory
Examples are described in terms of techniques and procedures found or
contemplated by the present inventors to work well in the practice of the invention.
These Laboratory Examples are exemplified through the use of standard
laboratory practices of the inventors. In light of the present disclosure and the
general level of skill in the art, those of skill will appreciate that the following
Laboratory Examples are intended to be exemplary only and that numerous
changes, modifications and alterations can be employed without departing from
the spirit and scope of the invention.

Example 1 

Construction of the Modified pET24 Expression Vector

The expression vector pGEX-2T (Amersham Pharmacia Biotech,
Piscataway, New Jersey) was used as a template in a polymerase chain reaction
to engineer a polyhistidine tag in frame to the sequence encoding glutathione
S-transferase (GST) and a thrombin protease site. The forward primer contained an
Nde I site (5' CGG CGG CGC CAT ATG AAA AAA GGT (CAT)6 GGT TCC CCT
ATA CTA GGT TAT TGG A 3') (SEQ ID NO:19) and the reverse primer (5' CGG
CGG CGC GGA TCC ACG CGG AAC CAG ATC CGA 3') (SEQ ID NO:20)
contained a BamH I site, which allowed for direct cloning of the amplified product
into pET24a (Novagen, Inc., Madison, Wisconsin) following restriction enzyme
digestion. The resulting sequence of the modified GST (SEQ ID NO:21) (the last six
residues are the thrombin protease site) is below:

MKKGHHHHHH HGSPILGYWK IKGLVQPTRL LLEYLEEKYE EHLYERDEGD 50 
KWRNKKFELG LEFPNLPYYI DGDVKLTQSM AIIRYIADKH NMLGGCPKER 100 
AEISMLEGAV LDIRYGVSRI AYSKDFETLK VDFLSKLPEM LKMFEDRLCH 150 
KTYLNGDHVT HPDFMLYDAL DVVLYMDPMC LDAFPKLVCF KKRIEAIPQI 200
DKYLKSSKYI AWPLQGWQAT FGGGDHPPKS DLVPRGS 237 
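
As a quick consistency check on the sequence listed above, the following minimal Python sketch confirms that the printed sequence has the stated length of 237 residues and ends in the thrombin recognition sequence LVPRGS. The variable and function names are illustrative only and are not part of the disclosure.

```python
# Modified GST sequence (SEQ ID NO:21) as printed above, with the spacing
# and the running residue counters removed.
MODIFIED_GST = (
    "MKKGHHHHHHHGSPILGYWKIKGLVQPTRLLLEYLEEKYEEHLYERDEGD"
    "KWRNKKFELGLEFPNLPYYIDGDVKLTQSMAIIRYIADKHNMLGGCPKER"
    "AEISMLEGAVLDIRYGVSRIAYSKDFETLKVDFLSKLPEMLKMFEDRLCH"
    "KTYLNGDHVTHPDFMLYDALDVVLYMDPMCLDAFPKLVCFKKRIEAIPQI"
    "DKYLKSSKYIAWPLQGWQATFGGGDHPPKSDLVPRGS"
)

# Thrombin recognition sequence; cleavage leaves the trailing "GS" on the
# downstream fusion partner.
THROMBIN_SITE = "LVPRGS"


def summarize(sequence):
    """Return the length and whether the sequence ends with the thrombin site."""
    return len(sequence), sequence.endswith(THROMBIN_SITE)


if __name__ == "__main__":
    length, has_site = summarize(MODIFIED_GST)
    print("length:", length)
    print("ends with thrombin site:", has_site)
```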

Example 2 

Mutagenesis (F602S AND F602D) of Human GR Ligand Binding Domain (LBD)

Two complementary oligonucleotides for each desired mutation were
constructed. The following sequences represent the oligonucleotides for the
Phenylalanine 602 Serine mutation:
Forward Primer (F602S) (SEQ ID NO:22):

5' TAC TCC TGG ATG TCC CTT ATG GCA TTT GCT CT 3' 

Reverse Primer (F602S) (SEQ ID NO:23): 

5' AG AGC AAA TGC CAT AAG GGA CAT CCA GGA GTA 3' 

Another separate mutation was also constructed. The sequences below 
represent the oligonucleotides for the Phenylalanine 602 Aspartic Acid mutation: 

Forward Primer (F602D) (SEQ ID NO:24): 

5' TAC TCC TGG ATG GAC CTT ATG GCA TTT GCT CT 3' 

Reverse Primer (F602D) (SEQ ID NO:25): 

5' AG AGC AAA TGC CAT AAG GTC CAT CCA GGA GTA 3'
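
The complementarity of each primer pair, and the amino acid encoded at the mutated codon, can be checked with a short Python sketch such as the one below. It is offered only as an illustration: the codon table is deliberately limited to the codons that occur in these primers, the reverse F602D primer is entered with GTC at the mutated codon (the reverse complement of GAC), and all names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Only the codons that occur in these primers; not a full genetic code table.
CODONS = {
    "TAC": "Y", "TCC": "S", "TGG": "W", "ATG": "M", "GAC": "D",
    "CTT": "L", "GCA": "A", "TTT": "F", "GCT": "A",
}

# (forward, reverse) primer pairs, written 5' to 3' without spaces.
PRIMERS = {
    "F602S": ("TACTCCTGGATGTCCCTTATGGCATTTGCTCT",
              "AGAGCAAATGCCATAAGGGACATCCAGGAGTA"),
    "F602D": ("TACTCCTGGATGGACCTTATGGCATTTGCTCT",
              "AGAGCAAATGCCATAAGGTCCATCCAGGAGTA"),
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from the first base; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    for name, (forward, reverse) in PRIMERS.items():
        print(name, "complementary:", reverse_complement(forward) == reverse)
        print(name, "forward primer encodes:", translate(forward))
```

In both forward primers the fifth codon is the mutated one (TCC for serine, GAC for aspartic acid), which corresponds to residue 602 in the sequences given later in this Example.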

The underlined letters depict the base changes from the wild type human
GR sequence. The GR LBD (amino acids 521-777) (SEQ ID NOs:9-10)
previously cloned into the pRSET A vector (Invitrogen of Carlsbad, California) was
used as the backbone to create the mutants. The procedure used to make the
mutation is outlined in the QuikChange Site-Directed Mutagenesis Kit sold by
Stratagene, La Jolla, California (Catalog # 200518). After the constructs were
sequence verified, the mutants of GR-LBD were subcloned in frame with the
glutathione S-transferase in the modified pET24 expression vector. A thrombin
protease site at the C-terminus of the glutathione S-transferase allows for
cleavage of the resultant fusion protein following expression.

The resulting final amino acid sequences for the mutant GR LBDs are 
below. The underlined, bolded amino acids depict the changes from the wild type 
human GR sequence. 

GR-LBD(521-777) F602S (SEQ ID NO:12) 

VPATLPQLTP TLVSLLEVIE PEVLYAGYDS SVPDSTWRIM TTLNMLGGRQ
VIAAVKWAKA IPGFRNLHLD DQMTLLQYSW MSLMAFALGW RSYRQSSANL
LCFAPDLIIN EQRMTLPCMY DQCKHMLYVS SELHRLQVSY EEYLCMKTLL
LLSSVPKDGL KSQELFDEIR MTYIKELGKA IVKREGNSSQ NWQRFYQLTK
LLDSMHEVVE NLLNYCFQTF LDKTMSIEFP EMLAEIITNQ IPKYSNGNIK
KLLFHQK

GR-LBD(521-777) F602D (SEQ ID NO: 14) 

VPATLPQLTP TLVSLLEVIE PEVLYAGYDS SVPDSTWRIM TTLNMLGGRQ 
VIAAVKWAKA IPGFRNLHLD DQMTLLQYSW MDLMAFALGW RSYRQSSANL 
LCFAPDLIIN EQRMTLPCMY DQCKHMLYVS SELHRLQVSY EEYLCMKTLL
LLSSVPKDGL KSQELFDEIR MTYIKELGKA IVKREGNSSQ NWQRFYQLTK 
LLDSMHEVVE NLLNYCFQTF LDKTMSIEFP EMLAEIITNQ IPKYSNGNIK 
KLLFHQK 
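
Because the listed fragments begin at residue 521, residue 602 is the 82nd residue of each listing. The following Python sketch, given only as an illustration with hypothetical names, compares the first 100 residues (521-620) of the two mutant sequences as printed above and reports where they differ.

```python
FIRST_RESIDUE = 521  # numbering of the first residue in the listed fragments

# Residues 521-620 of each mutant, copied from the first two rows above.
F602S_START = (
    "VPATLPQLTPTLVSLLEVIEPEVLYAGYDSSVPDSTWRIMTTLNMLGGRQ"
    "VIAAVKWAKAIPGFRNLHLDDQMTLLQYSWMSLMAFALGWRSYRQSSANL"
)
F602D_START = (
    "VPATLPQLTPTLVSLLEVIEPEVLYAGYDSSVPDSTWRIMTTLNMLGGRQ"
    "VIAAVKWAKAIPGFRNLHLDDQMTLLQYSWMDLMAFALGWRSYRQSSANL"
)


def differences(seq_a, seq_b, first=FIRST_RESIDUE):
    """List (residue number, residue in seq_a, residue in seq_b) for each mismatch."""
    return [
        (first + i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


if __name__ == "__main__":
    print(differences(F602S_START, F602D_START))
```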

Example 3

Expression of the Fusion Protein
BL21(DE3) cells (Novagen, Inc., Madison, Wisconsin) were transformed
following established protocols. Following overnight incubation at 37°C a single
colony was used to inoculate a 10 ml LB culture containing 50 µg/ml kanamycin
(Sigma Chemical Company, St. Louis, Missouri). The culture was grown for ~12
hrs at 37°C and then a 500 µl aliquot was used to inoculate flasks containing 1 liter
Circle Grow media (Bio101, Inc., now Qbiogene of Carlsbad, California) and the
required antibiotic. The cells were then grown at 22°C to an OD600 between 1
and 2 and then cooled to 16°C. Following a 30 min equilibration at that
temperature, dexamethasone (Spectrum, Gardena, California) (10 µM final
concentration) was added. Induction of expression was achieved by adding IPTG
(BACHEM AG, Switzerland) (final concentration 1 mM) to the cultures.
Expression at 16°C was continued for ~24 hrs. Cells were then harvested and
frozen at -80°C.

Referring now to Figure 1A, E. coli expression of mutant 6xHisGST-
GR(521-777) F602S is shown. Shown are the pellet (P - insoluble) and eluent (E
- soluble Ni++ binding) fractions of protein expressed in the absence of ligand (NL
- lanes 2 and 3) or in the presence (10 micromolar) of dexamethasone (DEX),
lanes 4 and 5, or RU486, lanes 6 and 7. The positions of molecular mass (kDa)
markers M (lane 1) (94, 67, 43, 30, 20 and 14 kDa, respectively) and of the
expressed protein are indicated to the left and right sides of the panel,
respectively.

Referring now to Figure 1B, E. coli expression of mutant 6xHisGST-
GR(521-777) F602D is shown. Shown are eluent fractions from Ni++ chelated 
resin of two separate samples. Protein was expressed in either the presence (+, 
lanes 2 and 4, 10 micromolar) or absence (-, lanes 3 and 5) of dexamethasone. 
The positions of molecular mass (kDa) markers M (lane 1) (94, 67, 43, 30, 20 and 
14 kDa, respectively) and of the expressed protein are indicated to the left and 
right sides of the panel, respectively. 

Example 4 
Purification of GR-LBD (F602S)
~200 g cells were resuspended in 700 mL lysis buffer (50 mM Tris pH 8.0,
150 mM NaCl, 2 M urea, 10% glycerol and 100 µM dexamethasone) and lysed by
passing 3 times through an APV Lab 2000 homogenizer. The lysate was
subjected to centrifugation (45 minutes, 20,000 g, 4°C), followed by a second 20
min spin at 20,000 g, 4°C. The cleared supernatant was filtered through coarse
pre-filters, and 50 mM Tris, pH 8.0, containing 150 mM NaCl, 10% glycerol and 1 M
imidazole was added to obtain a final imidazole concentration of 50 mM. This
lysate was loaded onto an XK-26 column (Pharmacia, Peapack, New Jersey)
packed with SEPHAROSE® [Ni++ charged] chelation resin (Pharmacia, Peapack,
New Jersey) and pre-equilibrated with lysis buffer supplemented with 50 mM
imidazole. Following loading, the column was washed to baseline absorbance
with equilibration buffer and a linear urea gradient (2 M to 0). For elution the
column was developed with a linear gradient from 50 to 500 mM imidazole in
50 mM Tris pH 8.0, 150 mM NaCl, 10% glycerol and 30 µM dexamethasone.
Column fractions of interest were pooled and 500 units of thrombin protease
(Amersham Pharmacia Biotech, Piscataway, New Jersey) were added for the
cleavage of the fusion protein.
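
The imidazole adjustment described above is a simple mixing calculation: a stock of concentration C_stock added to a starting volume V_0 gives a final concentration C_final when V_stock = V_0 x C_final / (C_stock - C_final). The Python sketch below works this through. The 700 mL starting volume is an assumption taken from the resuspension volume; the actual volume of cleared supernatant is not stated in this Example, and the function name is hypothetical.

```python
def stock_volume_needed(start_volume_ml, stock_mm, final_mm):
    """Volume of stock (mL) to add so the mixture reaches final_mm.

    Assumes the starting solution contains none of the solute and that
    volumes are additive.
    """
    if final_mm >= stock_mm:
        raise ValueError("final concentration must be below the stock concentration")
    return start_volume_ml * final_mm / (stock_mm - final_mm)


if __name__ == "__main__":
    # Assumed starting volume: 700 mL (the resuspension volume, not a measured value).
    volume = stock_volume_needed(start_volume_ml=700.0, stock_mm=1000.0, final_mm=50.0)
    print(f"add about {volume:.1f} mL of 1 M imidazole buffer")
```

Under these assumptions roughly 37 mL of the 1 M imidazole buffer is required. Because that buffer lacks urea and dexamethasone, both are diluted by about 5% in the process.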

This solution was then dialyzed against 1 liter of 50 mM Tris pH 8.0, 150
mM NaCl, 10% glycerol and 20 µM dexamethasone for ~10 hrs at 4°C. The
digested protein sample was filtered and then reloaded onto the same
re-equilibrated column. The cleaved GR-LBD was collected in the flow through
fraction. The diluted protein sample was concentrated with Centri-prep™ 10K
centrifugal filtration devices (Amicon/Millipore, Bedford, Massachusetts) to a
volume of 30 mls and then diluted 5 fold with 50 mM Tris pH 8.0, 10% glycerol,
10 mM DTT, 0.5 mM EDTA and 30 µM dexamethasone. The sample was then
loaded onto a pre-equilibrated XK-26 column (Pharmacia, Peapack, New Jersey)
packed with Poros HQ resin (PerSeptive Biosystems, Framingham,
Massachusetts). The cleaved GR LBD was collected in the flowthrough. The
NaCl concentration was adjusted to 500 mM and the dexamethasone
concentration was adjusted to 50 µM before the purified protein was concentrated
to ~1 mg/ml using the Centri-prep™ 10K centrifugal filtration devices.

Figure 1A depicts purification of E. coli expressed GR(521-777) F602S by
SDS-PAGE. Lane 1 contains the insoluble pellet fraction. Lane 2 contains the
soluble supernatant fraction. Lane 3 contains pooled eluent from the initial Ni++
column. Lane 4 contains the sample after thrombin digestion. Lane 5 contains
the flow through fraction after reload of the Ni++ column. Lane 6 contains the
concentrated protein after anion exchange. The positions of molecular mass (kDa)
markers (in Lane M, 94, 67, 43, 30, 20 and 14 kDa, respectively) and of the
expressed protein are indicated to the left and right sides of the panel,
respectively. Purification provides for the removal of any remaining associated
bacterial HSPs.

The final resultant sequence (SEQ ID NO:32) of the purified protein is
below. The first two residues (underlined and bolded) are vector derived and
represent the remaining residues of the thrombin cleavage site following digestion.

GSVPATLPQL TPTLVSLLEV IEPEVLYAGY DSSVPDSTWR IMTTLNMLGG
RQVIAAVKWA KAIPGFRNLH LDDQMTLLQY SWMSLMAFAL GWRSYRQSSA
NLLCFAPDLI INEQRMTLPC MYDQCKHMLY VSSELHRLQV SYEEYLCMKT
LLLLSSVPKD GLKSQELFDE IRMTYIKELG KAIVKREGNS SQNWQRFYQL
TKLLDSMHEV VENLLNYCFQ TFLDKTMSIE FPEMLAEIIT NQIPKYSNGN
IKKLLFHQK

Example 5
Ligand and Coactivator Binding of GR
All experiments were conducted with buffer containing 10 mM HEPES pH
7.4, 0.15 M NaCl, 3 mM EDTA, 0.005% polysorbate-20 and 5 mM DTT. For
activity determinations, 10 nM of fluorescein dexamethasone (Molecular Probes,
Eugene, Oregon) was titrated with increasing concentrations of the glucocorticoid
receptor in black 96-well plates (CoStar, Cambridge, Massachusetts). The
fluorescence polarization values for each concentration of receptor were
determined using a BMG PolarStar Galaxy fluorescence plate reader (BMG
Labtechnologies GmbH, Offenburg, Germany) with 485 nm excitation and 520 nm
emission filters. Binding isotherms were constructed and apparent EC50 values
were determined by non-linear least squares fit of the data to an equation for a
simple 1:1 interaction. Note that these EC50 values are not corrected for the
unlabeled dexamethasone present in the GR receptor preparations. For stability
studies, the fluorescence polarization of 10 nM fluorescein dexamethasone with 1
µM GST-GR LBD 521-777 (F602S) is read at specific time intervals in the
presence or absence of 25 µM of a peptide derived from the coactivator TIF2
(TIF2 732-756: QEPVSPKKKENALLRYLLDKDDTKD) (SEQ ID NO:17).
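
The fit to a simple 1:1 interaction described above can be illustrated with the sketch below. It is not the fitting routine used by the inventors, which is not specified: for simplicity it replaces a general non-linear least squares optimizer with a grid search over EC50 on a log scale, solving the two polarization plateaus by linear least squares at each grid point. The model P = P_free + (P_bound - P_free) x [R] / (EC50 + [R]), the concentrations, the polarization values and all names are assumptions chosen for illustration, and the data are synthetic.

```python
def one_to_one(receptor, ec50, p_free, p_bound):
    """Polarization expected for a simple 1:1 interaction (assumed model)."""
    fraction_bound = receptor / (ec50 + receptor)
    return p_free + (p_bound - p_free) * fraction_bound


def fit_ec50(concs, signals, grid_points=801, lo_exp=0.0, hi_exp=4.0):
    """Grid-search EC50 between 10**lo_exp and 10**hi_exp.

    For each candidate EC50 the model is linear in the two plateaus, so they
    are obtained by ordinary least squares; the candidate with the smallest
    sum of squared residuals is returned as (ec50, p_free, p_bound).
    """
    n = len(concs)
    mean_y = sum(signals) / n
    best = None
    for k in range(grid_points):
        ec50 = 10 ** (lo_exp + (hi_exp - lo_exp) * k / (grid_points - 1))
        x = [c / (ec50 + c) for c in concs]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, signals))
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        sse = sum((intercept + slope * xi - yi) ** 2 for xi, yi in zip(x, signals))
        if best is None or sse < best[0]:
            best = (sse, ec50, intercept, intercept + slope)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Synthetic, noise-free titration: receptor in nM, polarization in
    # arbitrary units. These are not data from the experiments described here.
    concs = [1.0, 3.0, 10.0, 30.0, 100.0, 300.0, 1000.0, 3000.0]
    signals = [one_to_one(c, ec50=100.0, p_free=50.0, p_bound=250.0) for c in concs]
    ec50, p_free, p_bound = fit_ec50(concs, signals)
    print(f"EC50 ~ {ec50:.1f} nM, plateaus {p_free:.1f} and {p_bound:.1f}")
```

As noted in the text, an apparent EC50 obtained this way is not corrected for unlabeled dexamethasone carried over in the receptor preparation.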

Data from these experiments are presented graphically in Figures 2A-2C.
These studies demonstrate that the GST-GR fusion protein and the cleaved GR
LBD alone bind dexamethasone in a saturable and competable manner (Figure
2A). It was also found that the GST-GR fusion protein binds a peptide from the
coactivator TIF2 with a submicromolar affinity. Binding of the GST-GR fusion
protein is enhanced by the agonist dexamethasone (DEX) and inhibited by the
antagonist RU486 (Fig. 2B). Finally, it was also found that the addition of the TIF2
peptide stabilizes the dexamethasone binding activity of the GST-GR fusion
protein.

Figure 2B was generated using Biacore techniques. Biacore relies on
changes in the refractive index at the surface layer upon binding of a ligand to a
protein immobilized on the layer. In this system, a collection of small ligands is
injected sequentially in a 2-5 microliter cell, wherein the protein is immobilized
within the cell. Binding is detected by surface plasmon resonance (SPR) by
recording laser light refracting from the surface. In general, the refractive index
change for a given change of mass concentration at the surface layer is practically
the same for all proteins and peptides, allowing a single method to be applicable
for any protein (Liedberg et al. (1983) Sensors Actuators 4:299-304; Malmquist
(1993) Nature 361:186-187). The purified protein is then used in the assay
without further preparation. A synthetic peptide with an amino-terminal biotin is
coupled to a sensor chip immobilized with streptavidin. The chip thus prepared is
then exposed to the potential ligand via the delivery system incorporated in the
instruments sold by Biacore (Uppsala, Sweden) to pipet the ligands in a
sequential manner (autosampler). The SPR signal on the chip is recorded and
changes in the refractive index indicate an interaction between the immobilized
target and the ligand. Analysis of the signal kinetics of on rate and off rate allows
the discrimination between non-specific and specific interaction.

Example 6 

Preparation of the GR/TIF2/Dex Complex

The GR/TIF2/Dex complex was prepared by adding a 2-fold excess of a
TIF2 peptide containing the sequence QEPVSPKKKENALLRYLLDKDDTKD (SEQ
ID NO:17). The above complex was diluted 10-fold with a buffer containing 500
mM ammonium acetate (NH4OAc), 50 mM Tris, pH 8.0, 10% glycerol, 10 mM
dithiothreitol (DTT), 0.5 mM EDTA, and 0.05% beta-n-octylglucoside (b-OG), and
was slowly concentrated to 6.3 mg/ml, then aliquoted and stored at -80°C.

Example 7 
Crystallization and Data Collection 
The GR/TIF2/DEX crystals were grown at room temperature in hanging
drops containing 3.0 µl of the above protein-ligand solutions, and 0.5 µl of well
buffer (50 mM HEPES, pH 7.5-8.5 (preferred pH range is 8.0 to 8.5), and 1.7-2.3 M
ammonium formate). Crystals were also obtained with mixing of the above protein
solution and the well buffer at various volume ratios. Crystals appeared overnight
and continuously grew to a size up to 300 microns within a week. Before data
collection, crystals were transiently mixed with the well buffer that contained an
additional 25% glycerol, and were then flash frozen in liquid nitrogen.

The GR/TIF2/DEX crystals formed in the P61 space group, with a = b =
126.014 Å, c = 86.312 Å, α = β = 90°, and γ = 120°. Each asymmetric unit contains
two molecules of the GR LBD with 56% of solvent content. Data were collected
with a Rigaku Raxis IV detector in house. The observed reflections were reduced,
merged and scaled with DENZO and SCALEPACK in the HKL2000 package (Z.
Otwinowski and W. Minor (1997)).
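
The relationship between the unit cell parameters and the solvent content can be sketched as follows. The hexagonal cell volume is V = a^2 c sin(120°), and the solvent fraction is estimated from the Matthews coefficient Vm as 1 - 1.23/Vm. The mass of about 33,000 Da per GR LBD/TIF2 complex used below is an assumption made only for illustration, so the estimate is approximate and need not reproduce the stated 56% exactly. Space group P61 has six asymmetric units per cell.

```python
import math


def hexagonal_cell_volume(a, c):
    """Cell volume in cubic Angstroms for a = b, alpha = beta = 90, gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews(volume, asu_per_cell, copies_per_asu, mass_da):
    """Return (Vm in cubic Angstroms per Da, estimated solvent fraction)."""
    vm = volume / (asu_per_cell * copies_per_asu * mass_da)
    return vm, 1.0 - 1.23 / vm


if __name__ == "__main__":
    volume = hexagonal_cell_volume(a=126.014, c=86.312)
    # Assumed mass per complex (GR LBD plus TIF2 peptide); not a value from the text.
    vm, solvent = matthews(volume, asu_per_cell=6, copies_per_asu=2, mass_da=33000.0)
    print(f"cell volume ~ {volume:.0f} cubic Angstroms")
    print(f"Vm ~ {vm:.2f}, solvent ~ {100 * solvent:.0f}%")
```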



Example 8
Structure Determination and Refinement
Table 5 is a table of the atomic structure coordinates used as the initial
model to solve the structure of the GR/TIF2/dexamethasone complex by
molecular replacement. The GR model is a homology model built on the
published structure of the progesterone receptor LBD and the SRC1 coactivator
peptide from the PPARα/Compound 1/SRC1 structure.

Compound 1 is an agonist of hPPARα, and has the IUPAC name
2-methyl-2-[4-{[(4-methyl-2-[4-trifluoromethylphenyl] thiazol-5-yl-carbonyl) amino]
methyl} phenoxy] propionic acid.

Compound 1

The initial model for the molecular replacement calculation comprised
coordinates for residues 527-776 of wild-type GR together with coordinates for
residues 685-697 of SRC-1, a coactivator very similar to TIF2. The model for GR
was built from the crystal structure of PR bound to progesterone (Shawn P.
Williams and Paul B. Sigler, Nature 393, 392-396 (1998)) using the MVP program
(Lambert, 1997). The coordinates for SRC-1 were obtained from a crystal
structure of PPARα bound to SRC-1. The SRC-1 model was positioned in the
coactivator binding site of GR by rotating the GR model and PPARα/SRC-1
complex into a common orientation that superimposed their backbone atoms.

It is noted that the amino acid sequence for SRC-1 differs substantially
from that of TIF2, although both coactivator sequences have the LXXLL motif.
Model building, including conversion of side-chains from the SRC-1 and wild-type
GR sequences to the actual TIF2 and GR F602S sequences, respectively, was
carried out with QUANTA™.

This model was used in a molecular replacement search with the CCP4
AmoRe™ program (Collaborative Computational Project Number 4, 1994, "The
CCP4 Suite: Programs for Protein Crystallography", Acta Cryst. D50, 760-763;
J. Navaza, Acta Cryst. A50, 157-163 (1994)) to determine the initial structure
solutions. Two solutions were obtained from the molecular replacement search
with a correlation coefficient of 43% and an R-factor of 45.3%, consistent with
two complexes within each asymmetric unit. The calculated phase from the
molecular replacement solutions was improved with solvent flattening, histogram
matching and two-fold noncrystallographic averaging as implemented in the
CCP4 dm program, and produced a clear map for the GR LBD, the TIF2 peptide
and the dexamethasone. As noted above, model building proceeded with
QUANTA™, and refinement progressed with CNX (Accelrys, Princeton, New
Jersey) and multiple cycles of manual rebuilding. The statistics of the structure are
summarized in Table 3 and coordinates are presented in Figure 4.

Surface areas were calculated with the Connolly MS program (Michael L.
Connolly, "Solvent-Accessible Surfaces of Proteins and Nucleic Acids," Science
221, 709-713 (1983)) and the MVP program (Lambert, 1997). The pocket volume
and binding site accessible waters were calculated with MVP.

Example 9 

Random Mutant Library of GR LBD and Selection using the LacI Fusion System

The expression vector pJS142A (Affymax Inc., Palo Alto, California)
containing the LacI protein was used to clone the wild type GR LBD in frame with
the LacI gene. Using standard error-incorporating PCR techniques, a random
mutant library was created within the context of the GR LBD. An advantage of the
LacI expression system is that the protein expressed has the ability to bind the
plasmid DNA from which it was derived. The mutant fusion proteins produced by
the random library were expressed in E. coli at 37°C. Lysis of the cell cultures
was achieved using lysozyme. The cell lysates were then added to a microtiter
plate containing the immobilized coactivator peptide biotinylated-TIF2 NR Box III.
The plasmid DNA was eluted from the DNA-protein complex bound to the plate
using 1 mM IPTG (Life Technologies). The eluted DNA was then re-transformed
and individual clones were isolated for sequence analysis. Mutant fusion proteins
with increased solubility and activity (ability to bind coactivator) should be selected
for after rounds of panning and increased stringency washes. Once the sequence
of the mutant LacI-GR LBD was identified, the same mutation was also made in
the pET24 expression vector (see Example 1). The expression and partial
purification of the mutant LacI-derived GST-GR LBD fusion proteins were
performed in the same manner as described in Examples 3 and 4.

Figure 1D depicts the partial purification of E. coli expressed GR (521-777)
for several mutants isolated by the LacI Fusion system. For solubility testing,
these mutants are expressed as a fusion to 6xHis-GST using the modified pET24
expression vector. Continuing with Figure 1D, Lane 1 contains the soluble
fraction of GST-GR (521-777) F602S; Lane 2: GR (521-777) wild type; Lane 3:
GST-GR (521-777) A580T/F602L; Lane 4: GST-GR (521-777) A574T; Lane 5:
GST-GR (521-777) Q615H; and Lane 6: GST-GR (521-777) Q615L. Molecular
weight markers (kD) are shown in Lane M.

Table 3

Statistics of Crystallographic Data and Structure

Crystals                          GR/TIF2 with dexamethasone

Space group                       P61
Resolution (Å)                    20.0-2.8
Unique reflections (N)            18,923
Completeness (%)                  99.7
I/σ (last shell)                  25.6 (2.2)
Rsym a (%)                        8.5

Refinement statistics
R factor b (%)                    33.4
R free (%)                        29.6
r.m.s.d. bond lengths (Å)         0.015
r.m.s.d. bond angles (degrees)    1.795
Number of H2O                     53
Total non-hydrogen atoms          4444

r.m.s.d. is the root mean square deviation from ideal
geometry.

a Rsym = Σ|Iavg - Ii| / ΣIi

b Rfactor = Σ|FP - FPcalc| / ΣFP, where FP and FPcalc are
observed and calculated structure factors. Rfree is
calculated from a randomly chosen 8% of reflections
that were never used in refinement, and Rfactor is
calculated for the remaining 92% of reflections.
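
The R factor definition in the footnote, and the division of reflections into a working set and a free set, can be sketched in Python as follows. This is a minimal illustration only and is not the procedure implemented in CNX: the structure factor amplitudes are made-up numbers, and the random seed and function names are hypothetical.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the supplied reflections."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_free_set(n_reflections, free_fraction=0.08, seed=0):
    """Randomly assign reflection indices to (working, free) sets."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * free_fraction)
    return sorted(indices[n_free:]), sorted(indices[:n_free])


if __name__ == "__main__":
    # Illustrative amplitudes only; not data from the structure described here.
    f_obs = [100.0, 80.0, 60.0, 40.0]
    f_calc = [90.0, 85.0, 55.0, 45.0]
    print(f"R = {r_factor(f_obs, f_calc):.4f}")

    # Splitting the 18,923 unique reflections of Table 3 into 92% / 8%.
    work, free = split_free_set(18923)
    print(len(work), "working reflections,", len(free), "free reflections")
```

Rfree is obtained by applying the same formula to the free set only, which is never used in refinement.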



WO 03/015692 

PCT/US02/22648 


REFERENCES

The references listed below as well as all references cited in the
specification are incorporated herein by reference to the extent that they
supplement, explain, provide a background for or teach methodology, techniques
and/or compositions employed herein.
Altschul et al., (1990) J. Mol. Biol. 215: 403-10
Apriletti et al., (1995) Protein Expression and Purification, 6: 368-370
Arth et al., (1958) J. Am. Chem. Soc. 80: 3161
Ausubel et al., (1989) Current Protocols in Molecular Biology
Bartlett et al., (1989) Special Pub., Royal Chem. Soc. 78: 182-96
Beato, (1989) Cell 56: 335-344
Blundell & Johnson, (1985) Method. Enzymol., 114A & 115B
Bohm, (1992) J. Comput. Aid. Mol. Des., 6: 61-78
Brooks et al., (1983) J. Comp. Chem., 8: 132
Brunger (1992) X-PLOR, Version 3.1. A System for X-ray Crystallography and
NMR, Yale University Press, New Haven, Connecticut
Case et al., (1997), AMBER 5, University of California, San Francisco
Cohen & Duke, (1984) J. Immunol. 152: 38-42
Cohen et al., (1990) J. Med. Chem. 33: 883-94
Crameri, A., et al., Nature Biotechnology 14, 315-319.
Creighton, (1983) Proteins: Structure an n m^..,^^,,. w , , f '
Co., New York
Crystallography, Academic Press, San Diego, California
Cull et al., (1992) Proc. Natl. Acad. Sci. 89: 1865-1869.
Danielsen et al., (1987) Molec. Endocrinol. 1: 816-822
Danielsen et al., (1989) Cancer Res. 49: 2286s-2291s
Drewes et al., (1996) Mol. Cell. Biol. 16: 925-31
Ducruix & Geige, (1992) Crystallization of Nucleic Acids and Proteins: A Practical
Approach, IRL Press, Oxford, England
Eastman-Reks & Vedeckis, (1986) Cancer Res. 46: 2457-2462
Eisen et al., (1994) Proteins 19: 199-221
Evans, (1988) Science 240: 889-895




Evans, (1989) in Recent Progress in Hormone Research (Clark, ed.) Vol. 45, pp.
1-27, Academic Press, San Diego, California
Fried & Sabo, (1954) J. Am. Chem. Soc. 76: 1455
Gampe et al., (2000) Mol. Cell 5: 545-55
Giguere et al., (1986) Cell 46: 645-652
Godowski et al., (1987) Nature 325: 365-368
Goodford, (1985) J. Med. Chem. 28: 849-57
Goodsell & Olsen, (1990) Proteins 8: 195-202
Green & Chambon, (1987) Nature 325: 75-78
Gribskov et al., (1986) Nucl. Acids. Res. 14: 6745
Gruol et al., (1989) Molec. Endocrinol. 3: 2119-2127
Harmon et al., (1979) J. Cell Physiol. 98: 267-278
Hauptman, (1997) Curr. Opin. Struct. Biol. 7: 672-80
Henikoff & Henikoff, (1989) Proc Natl Acad Sci U.S.A. 89: 10915
Hirschman et al., (1956) J. Am. Chem. Soc. 78: 4957
Hollenberg & Evans, (1988) Cell 55: 899-906
Hollenberg et al., (1987) Cell 49: 39-46
Hollenberg et al., (1989) Cancer Res. 49: 2292s-2294s
Homo-Delarche, (1984) Cancer Res. 44: 431-437
Janknecht, (1991) Proc. Natl. Acad. Sci. U.S.A. 88: 8972-8976
Jung, S., Honegger, A., and Pluckthun, A. (1999) J. Mol. Biol. 294, 163-180
Karlin and Altschul, (1993) Proc Natl Acad Sci U.S.A. 90: 5873-5887
Kelso & Munck, (1984) J. Immunol. 133: 784-791
Kuntz et al., (1992) J. Mol. Biol. 161: 269-88
Kyte & Doolittle, (1982) J. Mol. Biol. 157: 105-132
Lambert, (1997) in Practical Application of Computer-Aided Drug Design,
(Charifson, ed.) Marcel-Dekker, New York, pp. 243-303
Martin, (1992) J. Med. Chem. 35: 2145-54
McConkey et al., (1989) Arch. Biochem. Biophys. 269: 365-370
McPherson, (1982) Preparation and Analysis of Protein Crystals, John Wiley, New
York
McPherson, (1990) Eur. J. Biochem. 189: 1-23
McRee, (1992) J. Mol. Graphics 10: 44-46




McRee, (1993) Practical Protein Crystallography, Academic Press, New York
Miesfeld et al., (1987) Science 236: 423-427
Miranker & Karplus, (1991) Proteins 11: 29-34
Navia & Murcko, (1992) Curr. Opin. Struc. Biol. 2: 202-10
Needleman et al., (1970) J. Mol. Biol. 48: 443
Nicholls et al., (1991) Proteins 11: 281
Nishibata & Itai, (1991) Tetrahedron 47: 8985
Nolte et al., (1998) Nature 395: 137-43
Oberfield, J.L., et al., Proc Natl Acad Sci USA. (1999) May 25; 96(11): 6102-6
Oliveto et al., (1958) J. Am. Chem. Soc. 4431
Oro et al., (1988) Cell 55: 1109-1114
Z. Otwinowski and W. Minor (1997), Methods in Enzymology, Volume 276:
Macromolecular Crystallography, part A, p. 307-326, C.W. Carter, Jr. & R.M.
Sweet, Eds., Academic Press (New York).
Pearlman et al., (1995) Comput. Phys. Commun. 91: 1-41
Picard & Yamamoto, (1987) EMBO J. 6: 3333-3340
Picard et al., (1990) Cell Regul. 1: 291-299
Pjura, P., and Matthews, B.W. (1993) Protein Science 2, 2226-2236
Rarey et al., (1996) J. Comput. Aid. Mol. Des. 10: 41-54
Rossmann, ed., (1972) The Molecular Replacement Method, Gordon & Breach,
New York
Sambrook et al., (1989) Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor Laboratory, New York
Schatz et al., (1996) Methods Enzymol. 267: 171-191
Schwartz et al., eds., (1979), Atlas of Protein Sequence and Structure, National
Biomedical Research Foundation, pp. 357-358
Seielstad et al., (1995) Mol. Endocrinol. 9: 647-658
Sheldrick (1990) Acta Cryst. A46: 467
Shiau et al., (1998) Cell 95: 927-37
Sladek et al., Genes Dev. 4: 2353-65
Smith et al., (1981) Adv. Appl. Math. 2: 482
Thompson, (1989) Cancer Res. 49: 2259s-2265s.
Umesono & Evans, (1989) Cell 57: 1139-1146




Van Holde, (1971) Physical Biochemistry, Prentice-Hall, New Jersey, pp. 221-39
Voegel et al., (1998) EMBO J. 17: 507-519
Weber, (1991) Adv. Protein Chem. 41: 1-36
Weeks et al., (1993) Acta Cryst. D49: 179
Wellner, (1971) Anal. Chem. 43: 597
Wetmur & Davidson, (1968) J. Mol. Biol. 31: 349-70
Wyckoff et al., eds., Academic Press
Yamamoto, (1985) Ann. Rev. Genet. 19: 209-252
Yuh & Thompson, (1989) J. Biol. Chem. 264: 10904-10910
U.S. Patent No. 3,007,923
U.S. Patent No. 6,008,033
U.S. Patent No. 4,554,101
U.S. Patent No. 5,463,564
U.S. Patent No. 5,834,228
U.S. Patent No. 5,872,011
U.S. Patent No. 6,236,946
U.S. Patent No. 5,338,665
WO 84/03564
WO 99/26966






TABLE 4 

ATOMIC STRUCTURE COORDINATE DATA OBTAINED FROM X-RAY
DIFFRACTION FROM THE LIGAND BINDING DOMAIN OF GRα IN COMPLEX
WITH DEXAMETHASONE



ATOM   ATOM TYPE   RESIDUE   PROTEIN #   X   Y   Z   OCC   B

[Rows for atoms 1-37 (residues GLN 527, LEU 528, THR 529, PRO 530 and THR 531) are not legible.]
                                     33.859  1.00  59.00
4464  C13   DEX  1   20.600  22.429  32.344  1.00  59.00
4465  C14   DEX  1   22.105  22.863  32.515  1.00  59.00
4466  H14   DEX  1   22.701  21.953  32.834  1.00  59.00
4467  C15   DEX  1   22.602  23.242  31.129  1.00  59.00
4468  H151  DEX  1   23.685  23.110  31.097  1.00  59.00
4469  H152  DEX  1   22.383  24.310  30.934  1.00  59.00
4470  C16   DEX  1   21.806  22.306  30.152  1.00  59.00
4471  H16   DEX  1   21.207  22.984  29.504  1.00  59.00
4472  C17   DEX  1   20.783  21.450  31.097  1.00  59.00
4473  C18   DEX  1   19.540  23.677  31.944  1.00  59.00
4474  H181  DEX  1   19.873  24.157  31.015  1.00  59.00
4475  H182  DEX  1   18.547  23.297  31.792  1.00  59.00
4476  H183  DEX  1   19.525  24.449  32.700  1.00  59.00
4477  C19   DEX  1   20.959  25.638  36.205  1.00  59.00
4478  H191  DEX  1   21.232  26.215  35.303  1.00  59.00
4479  H192  DEX  1   19.899  25.426  36.127  1.00  59.00
4480  H193  DEX  1   21.132  26.270  37.072  1.00  59.00
4481  C20   DEX  1   19.417  21.067  30.421  1.00  59.00
4482  C21   DEX  1   18.443  20.176  31.204  1.00  59.00
4483  H211  DEX  1   17.932  20.800  31.959  1.00  59.00
4484  H212  DEX  1   19.031  19.423  31.779  1.00  59.00
4485  C22   DEX  1   22.671  21.454  29.301  1.00  59.00
4486  H221  DEX  1   22.061  20.835  28.644  1.00  59.00
4487  H222  DEX  1   23.334  22.077  28.688  1.00  59.00
4488  H223  DEX  1   23.300  20.785  29.933  1.00  59.00
4489  F1    DEX  1   22.519  22.128  35.397  1.00  59.00
4490  O1    DEX  1   24.201  23.808  39.692  1.00  59.00
4491  O2    DEX  1   19.179  23.598  34.905  1.00  59.00
4492  HO2   DEX  1   18.367  23.168  34.580  1.00  59.00
4493  O3    DEX  1   21.444  20.210  31.554  1.00  59.00
4494  H3    DEX  1   21.502  19.648  30.802  1.00  59.00
4495  O4    DEX  1   19.127  21.505  29.299  1.00  59.00
4496  O5    DEX  1   17.530  19.572  30.381  1.00  59.00
4497  H5    DEX  1   17.435  18.711  30.744  1.00  59.00
TABLE 5

ATOMIC COORDINATES FOR THE GR/SRC-1 MODEL USED IN MOLECULAR
REPLACEMENT



ATOM   ATOM TYPE   RESIDUE   PROTEIN #   X   Y   Z   OCC   B

[Rows for atoms 1-39 (residues GLN 527 through LEU 532) are not legible.]
 40  CA   LEU  532   -3.597  44.958  23.211  1.00  28.23
 41  C    LEU  532   -2.763  45.434  22.022  1.00  26.62
 42  O    LEU  532   -2.299  46.580  21.984  1.00  25.82
 43  CB   LEU  532   -2.702  44.274  24.232  1.00  25.68
 44  CG   LEU  532   -1.563  45.146  24.757  1.00  34.63
 45  CD1  LEU  532   -2.111  46.509  25.197  1.00  30.55
 46  CD2  LEU  532   -0.867  44.418  25.902  1.00  30.65
 47  N    VAL  533   -2.571  44.555  21.045  1.00  27.06
 48  CA   VAL  533   -1.809  44.925  19.863  1.00  23.18
 49  C    VAL  533   -2.593  45.921  19.014  1.00  24.05
 50  O    VAL  533   -2.030  46.890  18.496  1.00  26.77
 51  CB   VAL  533   -1.442  43.683  19.053  1.00  23.51
 52  CG1  VAL  533   -0.483  42.716  19.788  1.00  99.90
 53  CG2  VAL  533   -0.787  43.933  17.666  1.00  99.90
 54  N    SER  534   -3.900  45.708  18.871  1.00  25.92
 55  CA   SER  534   -4.703  46.659  18.103  1.00  27.71
 56  C    SER  534   -4.657  48.017  18.811  1.00  22.00
 57  O    SER  534   -4.612  49.063  18.165  1.00  26.26
 58  CB   SER  534   -6.156  46.179  17.998  1.00  31.49
 59  OG   SER  534   -6.853  46.235  19.247  1.00  99.90
 60  N    LEU  535   -4.662  48.000  20.140  1.00  26.88
 61  CA   LEU  535   -4.620  49.258  20.894  1.00  25.43
 62  C    LEU  535   -3.296  49.974  20.628  1.00  27.05
 63  O    LEU  535   -3.273  51.177  20.377  1.00  26.07
 64  CB   LEU  535   -4.802  48.981  22.385  1.00  26.35
 65  CG   LEU  535   -4.863  50.186  23.336  1.00  35.60
 66  CD1  LEU  535   -5.553  49.756  24.633  1.00  36.71
 67  CD2  LEU  535   -3.464  50.735  23.618  1.00  30.46
 68  N    LEU  536   -2.197  49.230  20.652  1.00  25.73
 69  CA   LEU  536   -0.883  49.817  20.384  1.00  23.64
 70  C    LEU  536   -0.843  50.404  18.977  1.00  27.62
 71  O    LEU  536   -0.242  51.450  18.756  1.00  22.81
 72  CB   LEU  536    0.221  48.764  20.527  1.00  24.64
 73  CG   LEU  536    0.433  48.131  21.906  1.00  25.70
 74  CD1  LEU  536    1.559  47.084  21.835  1.00  21.63
 75  CD2  LEU  536    0.782  49.226  22.923  1.00  20.83
 76  N    GLU  537   -1.455  49.717  18.013  1.00  24.62
 77  CA   GLU  537   -1.488  50.230  16.646  1.00  27.60
 78  C    GLU  537   -2.257  51.555  16.668  1.00  27.94
 79  O    GLU  537   -1.850  52.543  16.060  1.00  25.86
 80  CB   GLU  537   -2.207  49.232  15.730  1.00  27.45
 81  CG   GLU  537   -2.284  49.639  14.284  1.00  39.52
 82  CD   GLU  537   -3.073  48.750  13.320  1.00  99.90
 83  OE1  GLU  537   -3.217  49.017  12.134  1.00  99.90
 84  OE2  GLU  537   -3.596  47.637  13.905  1.00  99.90
 85  N    VAL  538   -3.358  51.575  17.406  1.00  25.24
 86  CA   VAL  538   -4.180  52.769  17.476  1.00  31.97
 87  C    VAL  538   -3.512  53.961  18.152  1.00  29.88






136  CB   LEU  544    7.089  63.668  14.010  1.00  28.58
137  CG   LEU  544    6.289  64.066  15.279  1.00  99.90
138  CD1  LEU  544    5.742  65.511  15.198  1.00  99.90
139  CD2  LEU  544    5.145  63.088  15.608  1.00  99.90
140  N    TYR  545    7.000  65.477  11.239  1.00  23.87
141  CA   TYR  545    7.839  66.165  10.260  1.00  27.94
142  C    TYR  545    8.960  66.895  10.979  1.00  28.04
143  O    TYR  545    8.794  67.338  12.116  1.00  24.24
144  CB   TYR  545    7.010  67.159   9.460  1.00  27.40
145  CG   TYR  545    6.083  66.476   8.487  1.00  34.60
146  CD1  TYR  545    4.825  66.038   8.889  1.00  37.81
147  CD2  TYR  545    6.489  66.207   7.181  1.00  38.29
148  CE1  TYR  545    3.992  65.348   8.016  1.00  47.26
149  CE2  TYR  545    5.661  65.516   6.295  1.00  39.38
150  CZ   TYR  545    4.414  65.090   6.724  1.00  41.71
151  OH   TYR  545    3.599  64.389   5.864  1.00  52.51
152  N    ALA  546   10.110  67.022  10.328  1.00  23.60
153  CA   ALA  546   11.213  67.720  10.964  1.00  26.37
154  C    ALA  546   11.100  69.231  10.756  1.00  29.47
155  O    ALA  546   11.688  70.011  11.510  1.00  28.14
156  CB   ALA  546   12.542  67.231  10.418  1.00  27.79
157  N    GLY  547   10.332  69.635   9.749  1.00  29.99
158  CA   GLY  547   10.213  71.051   9.439  1.00  32.21
159  C    GLY  547   11.541  71.501   8.836  1.00  39.13
160  O    GLY  547   11.964  72.645   8.992  1.00  40.76
161  N    TYR  548   12.206  70.581   8.140  1.00  38.38
162  CA   TYR  548   13.505  70.850   7.528  1.00  46.41
163  C    TYR  548   13.429  71.638   6.208  1.00  47.68
164  O    TYR  548   12.536  71.420   5.391  1.00  49.96
165  CB   TYR  548   14.242  69.521   7.333  1.00  42.73
166  CG   TYR  548   15.579  69.661   6.681  1.00  48.58
167  CD1  TYR  548   16.740  69.612   7.459  1.00  99.90
168  CD2  TYR  548   15.683  69.849   5.299  1.00  99.90
169  CE1  TYR  548   17.990  69.755   6.864  1.00  99.90
170  CE2  TYR  548   16.935  69.992   4.706  1.00  99.90
171  CZ   TYR  548   18.085  69.945   5.488  1.00  99.90
172  OH   TYR  548   19.311  70.089   4.902  1.00  99.90
173  N    ASP  549   14.389  72.543   6.016  1.00  51.90
174  CA   ASP  549   14.465  73.400   4.832  1.00  52.61
175  C    ASP  549   14.420  72.658   3.499  1.00  54.82
176  O    ASP  549   13.434  72.755   2.768  1.00  57.82
177  CB   ASP  549   15.727  74.257   4.903  1.00  52.84
178  CG   ASP  549   15.881  75.347   3.832  1.00  99.90
179  OD1  ASP  549   16.948  75.607   3.295  1.00  99.90
180  OD2  ASP  549   14.700  75.968   3.534  1.00  99.90
181  N    SER  550   15.499  71.940   3.190  1.00  55.70
182  CA   SER  550   15.638  71.176   1.951  1.00  56.60
183  C    SER  550   16.129  72.063   0.803  1.00  60.14
[Rows for atoms 184-197 (SER residues) are not legible.]

It will be understood that various details of the invention may be changed
without departing from the scope of the invention. Furthermore, the foregoing
description is for the purpose of illustration only, and not for the purpose of
limitation, the invention being defined by the claims.

CLAIMS

What is claimed is: 

1. A method of modifying a test NR polypeptide, the method 
comprising: 

(a) providing a test NR polypeptide sequence having a characteristic 
that is targeted for modification; 

(b) aligning the test NR polypeptide sequence with at least one
reference NR polypeptide sequence for which an X-ray structure is
available, wherein the at least one reference NR polypeptide
sequence has a characteristic that is desired for the test NR
polypeptide;

(c) building a three-dimensional model for the test NR polypeptide using
the three-dimensional coordinates of the X-ray structure(s) of the at
least one reference polypeptide and its sequence alignment with the
test NR polypeptide sequence;

(d) examining the three-dimensional model of the test NR polypeptide
for a difference in an amino acid residue as compared to the at least
one reference polypeptide, wherein the residues are associated with
the desired characteristic; and

(e) mutating an amino acid residue in the test NR polypeptide sequence
located at a difference identified in step (d) to a residue associated
with the desired characteristic, whereby the test NR polypeptide is
modified.
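
The comparison underlying steps (b) and (d) can be illustrated with a minimal Python sketch that walks a pairwise alignment and reports the positions at which the test and reference residues differ. It is only a sketch under stated assumptions: the function name, the gap character, and the short aligned fragments in the usage note are hypothetical and are not actual NR sequences. Deciding which differences are associated with the desired characteristic still requires the three-dimensional model.

```python
def alignment_differences(test_aln, ref_aln, test_start=1):
    """List (test residue number, test residue, reference residue) for every
    aligned column in which both sequences have a residue and they differ.

    test_aln and ref_aln are equal-length aligned strings using '-' for gaps;
    test_start is the residue number of the first test residue.
    """
    if len(test_aln) != len(ref_aln):
        raise ValueError("aligned sequences must have equal length")
    differences = []
    resnum = test_start - 1
    for test_res, ref_res in zip(test_aln, ref_aln):
        if test_res != "-":
            resnum += 1  # only real test residues advance the numbering
        if test_res == "-" or ref_res == "-":
            continue  # gap columns are insertions or deletions, not substitutions
        if test_res != ref_res:
            differences.append((resnum, test_res, ref_res))
    return differences
```

For example, `alignment_differences("MF-LK", "MSALR", test_start=601)` lists residue 602 (F in the test fragment, S in the reference fragment) and residue 604 (K versus R) as candidates for step (d).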

2. The method of claim 1, wherein the reference NR polypeptide 
sequence is a PR sequence, and wherein the test polypeptide sequence is a GR 
polypeptide sequence. 

3. The method of claim 1, wherein the polypeptide of a crystalline GR
LBD is used as the reference polypeptide sequence. 

4. The method of claim 1, wherein the method is carried out in a 
bacterial expression system. 
5. The method of claim 1, wherein the bacteria is E. coli.

6. A method for modifying a test NR polypeptide to improve the 
solubility, stability in solution and other solution behavior, to alter and preferably 
improve the folding and stability of the folded structure, to alter and preferably 
improve the ability to form ordered crystals, or combination thereof, the method 
comprising: 

(a) providing a test NR polypeptide sequence for which the solubility, 
stability in solution, other solution behavior, tendency to fold 
properly, ability to form ordered crystals, or combination thereof is 
different from that desired; 

(b) aligning the test NR polypeptide sequence with the sequences of 
one or more reference NR polypeptides for which the X-ray structure 
is available and for which the solution properties, folding behavior 
and crystallization properties are closer to those desired; 

(c) building a three-dimensional model for the test NR polypeptide using 
the three-dimensional coordinates of the X-ray structure(s) of the 
one or more of reference polypeptides and their sequence alignment 
with the test NR polypeptide sequence;

(d) examining the three-dimensional model of the test NR polypeptide 
for lipophilic side-chains that are exposed to solvent, for clusters of 
two or more lipophilic side-chains exposed to solvent, for lipophilic 
pockets and clefts on the surface of the protein model, for sites on 
the surface of the protein model that are more lipophilic than the 
corresponding sites on the structure(s) of the reference NR 
polypeptide(s), or combinations thereof; 

(e) for each residue identified in step (d), mutating the amino acid to an 
amino acid with different hydrophilicity, whereby the exposed 
lipophilic sites are reduced, and the solution properties improved; 

(f) examining the three-dimensional model at each site where the 
amino acid in the test NR polypeptide is different from the amino 
acid at the corresponding position in the reference NR polypeptide, 
and checking whether the amino acid in the test NR polypeptide
makes favorable interactions with the atoms that lie around it in the 
three-dimensional model, considering the side-chain conformations 
predicted in step (c), considering alternative conformations of the 
side-chains, considering the presence of water molecules, or 
combinations thereof; 
(g) for each residue identified in step (f) as not making favorable
interactions with the atoms that lie around it, mutating the residue to
another amino acid that makes favorable interactions with the atoms
that lie around it, thereby promoting the tendency for the test NR
polypeptide to fold into a stable structure with improved solution
properties, less tendency to unfold, and greater tendency to form
ordered crystals;

(h) examining the three-dimensional model at each residue position
where the amino acid in the test NR polypeptide is different from the
amino acid at the corresponding position in the reference NR
polypeptide, and checking whether the steric packing, hydrogen
bonding and other energetic interactions could be improved by
mutating that residue or any one or more of the surrounding
residues lying within 8 angstroms in the three-dimensional model;

(i) for each residue position identified in step (h) as potentially allowing
an improvement in the packing, hydrogen bonding and energetic
interactions, mutating those residues individually or in combination
to residues that improve the packing, hydrogen bonding, energetic
interactions, and combinations thereof, thereby promoting the
tendency for the test NR polypeptide to fold into a stable structure
with improved solution properties, less tendency to unfold, and
greater tendency to form ordered crystals.

7. The method of claim 6, further comprising optimizing the side-chain
conformations in the three-dimensional model of the test NR polypeptide by
generating many alternative side-chain conformations, refining by energy
minimization, and selecting side-chain conformations with lower energy.
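
Steps (d) and (e) of claim 6 can be illustrated with a minimal Python sketch that flags solvent-exposed lipophilic residues and lists more hydrophilic replacements. It uses the Kyte & Doolittle (1982) hydropathy scale as a stand-in for lipophilicity, which is an assumption, since the claim does not prescribe a scale. The cutoffs, the function names, and the relative solvent accessibility values in the usage note are hypothetical; in practice accessibility would be computed from the three-dimensional model.

```python
# Kyte & Doolittle hydropathy values; higher means more hydrophobic.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def exposed_lipophilic(residues, hydropathy_cutoff=1.8, exposure_cutoff=0.3):
    """Return (residue number, amino acid) for residues that are both
    hydrophobic and solvent exposed.

    residues is an iterable of (residue number, one-letter amino acid,
    relative solvent accessibility between 0 and 1). Both cutoffs are
    illustrative assumptions, not values from the source.
    """
    return [
        (number, aa)
        for number, aa, accessibility in residues
        if KYTE_DOOLITTLE[aa] >= hydropathy_cutoff and accessibility >= exposure_cutoff
    ]


def more_hydrophilic_substitutions(aa):
    """Amino acids with lower hydropathy than aa, most hydrophilic first."""
    candidates = [b for b in KYTE_DOOLITTLE if KYTE_DOOLITTLE[b] < KYTE_DOOLITTLE[aa]]
    return sorted(candidates, key=KYTE_DOOLITTLE.get)
```

Given the hypothetical input `[(602, "F", 0.45), (603, "L", 0.05), (604, "K", 0.80)]`, only residue 602 is flagged: the leucine is buried and the lysine is already hydrophilic. A ranking by hydropathy alone does not check that a replacement makes favorable interactions with its surroundings, which is the subject of steps (f) through (i).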
8. The method of claim 6, wherein the mutating of step (e) further
comprises a mutation to a more hydrophilic amino acid. 

9. The method of claim 6, wherein the reference NR polypeptide is PR,
and wherein the test NR polypeptide is GRα.

10. The method of claim 6, wherein the reference NR polypeptide is
GRα, and wherein the test NR polypeptide is GRβ or MR.

11. The method of claim 6, wherein the method is carried out in a 
bacterial expression system. 

12. The method of claim 6, wherein the bacteria is E. coli.

13. An isolated GR polypeptide comprising a mutation in a ligand 
binding domain, wherein the mutation alters the solubility of the ligand binding 
domain. 


14. An isolated GR polypeptide, or functional portion thereof, having one 
or more mutations comprising a substitution of a hydrophobic amino acid residue 
by a hydrophilic amino acid residue in a ligand binding domain. 

15. The isolated polypeptide of claims 13 or 14, wherein the mutation is
at a residue selected from the group consisting of V552, W557, F602, L636, Y648,
W712, L741, L535, V538, C638, M691, V702, Y648, Y660, L685, M691, V702,
W712, L733, Y764 and combinations thereof.

16. The isolated polypeptide of claims 13 or 14, wherein the mutation is
selected from the group consisting of V552K, W557S, F602S, F602D, F602E,
F602Y, F602T, F602N, F602C, L636E, Y648Q, W712S, L741R, L535T, V538S,
C638S, M691T, V702T, W712T and combinations thereof.
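
The substitution labels in claims 15 and 16 follow the usual convention of wild-type residue, residue number, replacement residue. A minimal Python sketch of reading and applying such a label is given below. The function names are hypothetical, and the three-residue fragment in the usage note is invented for illustration and is not the GR sequence.

```python
import re

# One-letter wild-type residue, residue number, one-letter replacement.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label such as 'F602S' into ('F', 602, 'S')."""
    match = MUTATION_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_mutation(sequence, label, first_residue_number):
    """Return sequence with the substitution applied.

    first_residue_number is the residue number of sequence[0]. The
    wild-type residue is checked before substituting, so a numbering
    mismatch raises an error instead of mutating the wrong position.
    """
    wild_type, position, replacement = parse_mutation(label)
    index = position - first_residue_number
    if not 0 <= index < len(sequence):
        raise ValueError(f"residue {position} is outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]
```

With a made-up fragment numbered from 601, `apply_mutation("AFG", "F602S", 601)` returns `"ASG"`.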
17. An isolated GR LBD polypeptide, or functional portion thereof,
having a F602S mutation or a F602D mutation, or a phenylalanine to serine or
phenylalanine to aspartic acid mutation at an analogous position in the sequence
in any polypeptide based on sequence alignment to GRα.

18. The isolated polypeptide of claim 17, wherein the polypeptide has 
the sequence of SEQ ID NO: 12 or 14. 

19. An isolated nucleic acid molecule encoding a GR polypeptide of any
of claims 13-18. 

20. A chimeric gene, comprising the nucleic acid molecule of claim 19
operably linked to a heterologous promoter.

21. A vector comprising the chimeric gene of claim 20.






22. A host cell comprising the chimeric gene of claim 20. 

23. A method of detecting a nucleic acid molecule that encodes a GR 
polypeptide, the method comprising: 

(a) procuring a biological sample comprising nucleic acid material; 

(b) hybridizing the nucleic acid molecule of claim 19 under stringent 
hybridization conditions to the biological sample of (a), thereby 
forming a duplex structure between the nucleic acid of claim 19 and 
a nucleic acid within the biological sample; and 

(c) detecting the duplex structure of (b), whereby a GR encoding nucleic 
acid molecule is detected. 
24. An antibody that specifically recognizes a GR polypeptide of any of
claims 13-18. 



25. A method for producing an antibody that specifically recognizes a 
GR polypeptide, the method comprising: 

(a) recombinantly or synthetically producing a GR polypeptide of any of 
claims 13-18, or portion thereof; 

(b) formulating the polypeptide of (a) whereby it is an effective 
immunogen; 

(c) administering to an animal the formulation of (b) to generate an 
immune response in the animal comprising production of antibodies, 
wherein antibodies are present in the blood serum of the animal; and 

(d) collecting the blood serum from the animal of (c), the blood serum 
comprising antibodies that specifically recognize a GR polypeptide. 



26. A method for detecting a level of GR polypeptide, the method 
comprising: 

(a) obtaining a biological sample comprising peptidic material; and 

(b) detecting a GR polypeptide in the biological sample of (a) by 
immunochemical reaction with the antibody of claim 24, whereby an 
amount of GR polypeptide in a sample is determined. 



27. A method for identifying a substance that modulates GR LBD 
function, the method comprising: 

(a) isolating a GR LBD polypeptide of any of claims 13-18; 

(b) exposing the isolated GR polypeptide to a plurality of substances; 

(c) assaying binding of a substance to the isolated GR polypeptide; and 
(d) selecting a substance that demonstrates specific binding to the
isolated GR LBD polypeptide. 



28. A substantially pure GR ligand binding domain polypeptide in
crystalline form.



29. The polypeptide of claim 28, wherein the crystalline form comprises
lattice constants of a = b = 126.014 Å, c = 86.312 Å, α = 90°, β = 90°, γ = 120°.



30. The polypeptide of claim 28 or 29, wherein the crystalline form is a
hexagonal crystalline form.



31. The polypeptide of claim 28 or 29, wherein the crystalline form has a
space group of P61.

32. The polypeptide of claim 28 or 29, wherein the GRα ligand binding
domain polypeptide has the amino acid sequence shown in any one of SEQ ID
NOs:12, 14, 16 and 31.

33. The polypeptide of claim 28 or 29, wherein the GR ligand binding
domain polypeptide is in complex with a ligand.

34. The polypeptide of claim 33, wherein the ligand is a steroid. 



35. The polypeptide of claim 34, wherein the steroid is dexamethasone.



36. The polypeptide of claim 28 or 29, wherein the GR ligand binding 
domain polypeptide is in complex with a ligand and a peptide. 

37. The polypeptide of claim 36, wherein the ligand is a steroid. 



38. The polypeptide of claim 37, wherein the steroid is dexamethasone.

39. The polypeptide of claim 38, wherein the ligand is a steroid and the
peptide is a fragment of a co-activator.

40. The polypeptide of claim 36, wherein the ligand is a steroid and the
peptide is a fragment of a co-repressor.

41. The polypeptide of claim 36, wherein the ligand is dexamethasone
and the peptide comprises an LXXLL (SEQ ID NO:18) motif.

42. The polypeptide of claim 36, wherein the peptide is a fragment of a
TIF2 protein.



43. The polypeptide of claim 42, wherein the ligand is dexamethasone
and the peptide has the amino acid sequence shown in any one of SEQ ID
NO:17.

44. The polypeptide of claim 28 or 29, wherein the GR ligand binding
domain has a crystalline structure further characterized by the atomic structure
coordinates shown in Table 4.

45. The polypeptide of claim 28 or 29, wherein the crystalline form
contains two GRα ligand binding domain polypeptides in the asymmetric unit.

46. The polypeptide of claim 28 or 29, wherein the crystalline form is
such that the three-dimensional structure of the crystallized GR ligand binding
domain polypeptide can be determined to a resolution of about 2.8 Å or better.

47. The polypeptide of claim 28 or 29, wherein the crystalline form
contains one or more atoms having a molecular weight of 40 grams/mol or
greater.

48. A method for determining the three-dimensional structure of a
crystallized GR ligand binding domain polypeptide to a resolution of about 2.8 Å or
better, the method comprising:

(a) crystallizing a GR ligand binding domain polypeptide; and 

(b) analyzing the GR ligand binding domain polypeptide to determine 
the three-dimensional structure of the crystallized GR ligand binding 
domain polypeptide, whereby the three-dimensional structure of a 
crystallized GR ligand binding domain polypeptide is determined to a 
resolution of about 2.8 Å or better.

49. The method of claim 48, wherein the analyzing is by X-ray
diffraction.

50. The method of claim 48, wherein the crystallization is accomplished
by the hanging drop method, and wherein the GR ligand binding domain is mixed
with a reservoir.

51. The method of claim 50, wherein the reservoir comprises 50 mM
HEPES, pH 7.5-8.5, and 1.7-2.3 M ammonium formate.

52. The method of claim 48, wherein the crystallizing further comprises
crystallizing the GRα ligand binding domain with a ligand and a peptide.

53. The method of claim 52, wherein the ligand is a steroid. 

54. The method of claim 53, wherein the ligand is dexamethasone. 

55. The method of claim 52, wherein the ligand is a steroid and the 
peptide is a fragment of a co-activator. 

56. The method of claim 52, wherein the ligand is a steroid and the
peptide is a fragment of a co-repressor.

57. The method of claim 52, wherein the ligand is dexamethasone and
the peptide comprises an LXXLL (SEQ ID NO:18) motif.

58. The method of claim 52, wherein the peptide is a fragment of a TIF2
protein.

59. The method of claim 52, wherein the ligand is dexamethasone and 
the peptide has the amino acid sequence shown in SEQ ID NO:17. 

60. A method of generating a crystallized GR ligand binding domain
polypeptide, the method comprising:

(a) incubating a solution comprising a GR ligand binding domain with a
reservoir; and

(b) crystallizing the GR ligand binding domain polypeptide using the
hanging drop method, whereby a crystallized GR ligand binding
domain polypeptide is generated.

61. The method of claim 60, wherein the incubating further comprises
incubating the GR ligand binding domain with a ligand and a peptide.

62. The method of claim 61, wherein the ligand is a steroid.

63. The method of claim 62, wherein the steroid is dexamethasone.

64. The method of claim 61, wherein the ligand is a steroid and the
peptide is a fragment of a co-activator.

65. The method of claim 61, wherein the ligand is a steroid and the
peptide is a fragment of a co-repressor.

66. The method of claim 61, wherein the ligand is dexamethasone and
the peptide comprises an LXXLL (SEQ ID NO:18) motif.

67. The method of claim 61, wherein the peptide is a fragment of a TIF2
protein.



68. A crystallized GRα ligand binding domain polypeptide produced by
the method of claim 60.

69. A method of designing a modulator of a nuclear receptor, the
method comprising:

(a) designing a potential modulator of a nuclear receptor that will make
interactions with amino acids in the ligand binding site of the nuclear
receptor based upon the atomic structure coordinates of a GR ligand
binding domain polypeptide;

(b) synthesizing the modulator; and

(c) determining whether the potential modulator modulates the activity
of the nuclear receptor, whereby a modulator of a nuclear receptor is
designed.

70. The method of claim 69, wherein the atomic structure coordinates
further comprise a ligand and a peptide bound to the GR ligand binding domain
polypeptide.

71. The method of claim 69, wherein the atomic structure coordinates 
are the atomic structural coordinates shown in Table 3. 

72. The method of claim 70, wherein the ligand is a steroid. 

73. The method of claim 72, wherein the steroid is dexamethasone. 

74. The method of claim 70, wherein the ligand is a steroid and the
peptide is a fragment of a co-activator.

75. The method of claim 70, wherein the ligand is a steroid and the
peptide is a fragment of a co-repressor.

76. The method of claim 70, wherein the ligand is dexamethasone and
the peptide comprises an LXXLL (SEQ ID NO:18) motif.

77. The method of claim 70, wherein the peptide is a fragment of a TIF2
protein.

78. A method of designing a modulator that selectively modulates the
activity of a GRα polypeptide, the method comprising:

(a) obtaining a crystalline form of a GRα ligand binding domain
polypeptide;

(b) determining the three-dimensional structure of the crystalline form of
the GRα ligand binding domain polypeptide; and

(c) synthesizing a modulator based on the three-dimensional structure
of the crystalline form of the GRα ligand binding domain polypeptide,
whereby a modulator that selectively modulates the activity of a
GRα polypeptide is designed.

79. The method of claim 78, wherein the method further comprises
contacting a GRα ligand binding domain polypeptide with the potential modulator;
and assaying the GRα ligand binding domain polypeptide for binding of the
potential modulator, for a change in activity of the GRα ligand binding domain
polypeptide, or both.

80. The method of claim 78, wherein the crystalline form is a hexagonal
form.

81. The method of claim 80, wherein the crystals are such that the
three-dimensional structure of the crystallized GRα ligand binding domain
polypeptide can be determined to a resolution of about 2.8 Å or better.

82. The method of claim 78, wherein the crystalline form is a GRα
ligand binding domain with a ligand and a peptide.



83. The method of claim 82, wherein the ligand is a steroid.

84. The method of claim 83, wherein the steroid is dexamethasone.

85. The method of claim 82, wherein the ligand is a steroid and the
peptide is a fragment of a co-activator.

86. The method of claim 82, wherein the ligand is a steroid and the
peptide is a fragment of a co-repressor.

87. The method of claim 82, wherein the ligand is dexamethasone and
the peptide comprises an LXXLL (SEQ ID NO:18) motif.

88. The method of claim 82, wherein the peptide is a fragment of a TIF2
protein.



89. The method of claim 78, wherein the three-dimensional structure of
the crystalline form of the GRα ligand binding domain polypeptide is described by
the atomic coordinates shown in Table 4.

90. A method of screening a plurality of compounds for a modulator of a
GR ligand binding domain polypeptide, the method comprising:

(a) providing a library of test samples; 

(b) contacting a GR ligand binding domain polypeptide with each test 
sample; 

(c) detecting an interaction between a test sample and the GR ligand 
binding domain polypeptide; 

(d) identifying a test sample that interacts with the GR ligand binding 
domain polypeptide; and 




(e) isolating a test sample that interacts with the GR ligand binding 
domain polypeptide, whereby a plurality of compounds is screened 
for a modulator of a GR ligand binding domain polypeptide. 

91. The method of claim 90, wherein the test samples are bound to a 
substrate. 



92. The method of claim 90, wherein the test samples are synthesized 
directly on a substrate. 



93. A method for identifying a GR modulator, the method comprising: 

(a) providing atomic coordinates of a GR ligand binding domain to a
computerized modeling system; and

(b) modeling ligands that fit spatially into the binding pocket of the GR
ligand binding domain to thereby identify a GR modulator, whereby a
GR modulator is identified.

94. The method of claim 93, wherein the method further comprises
identifying in an assay for GR-mediated activity a modeled ligand that increases or
decreases the activity of the GR.

95. The method of claim 93, wherein the atomic coordinates are the
atomic coordinates shown in Table 4.

96. A method of identifying a modulator that selectively modulates the
activity of a GRα polypeptide compared to other GR polypeptides, the method
comprising:

(a) providing atomic coordinates of a GRα ligand binding domain to a
computerized modeling system; and

(b) modeling a ligand that fits into the binding pocket of a GRα ligand
binding domain and that interacts with conformationally constrained
residues of a GRα conserved among GR subtypes, whereby a
modulator that selectively modulates the activity of a GRα
polypeptide compared to other polypeptides is identified.

97. The method of claim 96, wherein the method further comprises
identifying in a biological assay for GR activity a modeled ligand that selectively
binds to GRα and increases or decreases the activity of said GRα.

98. The method of claim 96, wherein the atomic coordinates are the 
atomic coordinates shown in Table 4. 

99. A method of designing a modulator of a GR polypeptide, the method
comprising:

(a) selecting a candidate GR ligand;

(b) determining which amino acid or amino acids of a GR polypeptide
interact with the ligand using a three-dimensional model of a
crystallized protein comprising a GRα LBD;

(c) identifying in a biological assay for GR activity a degree to which the
ligand modulates the activity of the GR polypeptide;

(d) selecting a chemical modification of the ligand wherein the
interaction between the amino acids of the GR polypeptide and the
ligand is predicted to be modulated by the chemical modification;

(e) synthesizing a chemical compound with the selected chemical
modification to form a modified ligand;

(f) contacting the modified ligand with the GR polypeptide;

(g) identifying in a biological assay for GR activity a degree to which the
modified ligand modulates the biological activity of the GR
polypeptide; and

(h) comparing the biological activity of the GR polypeptide in the
presence of modified ligand with the biological activity of the GR
polypeptide in the presence of the unmodified ligand, whereby a
modulator of a GR polypeptide is designed.

100. The method of claim 99, wherein the GR polypeptide is a GRα
polypeptide.

101. The method of claim 99, wherein the three-dimensional model of a
crystallized protein is a GRα ligand binding domain with a ligand and a peptide.

102. The method of claim 101, wherein the ligand is a steroid.

103. The method of claim 101, wherein the steroid is dexamethasone.

104. The method of claim 101, wherein the ligand is a steroid and the
peptide is a fragment of a co-activator.

105. The method of claim 101, wherein the ligand is a steroid and the
peptide is a fragment of a co-repressor.

106. The method of claim 101, wherein the ligand is dexamethasone and
the peptide comprises an LXXLL (SEQ ID NO:18) motif.

107. The method of claim 101, wherein the peptide is a fragment of a
TIF2 protein.

108. The method of claim 99, wherein the three-dimensional model is
represented by the three dimensional coordinates shown in Table 4.

109. The method of claim 99, wherein the method further comprises
repeating steps (a) through (f), if the biological activity of the GR polypeptide in the
presence of the modified ligand varies from the biological activity of the GR
polypeptide in the presence of the unmodified ligand.

110. An assay method for identifying a compound that inhibits binding of
a ligand to a GR polypeptide, the assay method comprising:

(a) designing a test inhibitor compound based on the three dimensional
atomic coordinates of GR;

(b) incubating a GR polypeptide with a ligand in the presence of a test
inhibitor compound;

(c) determining an amount of ligand that is bound to the GR
polypeptide, wherein decreased binding of ligand to the GR protein
in the presence of the test inhibitor compound relative to binding of
ligand in the absence of the test inhibitor compound is indicative of
inhibition; and

(d) identifying the test compound as an inhibitor of ligand binding if
decreased ligand binding is observed, whereby a compound that
inhibits binding of a ligand to a GR polypeptide is identified.



111. The method of claim 110, wherein the ligand is a steroid. 

112. The method of claim 111, wherein the steroid is dexamethasone.

113. The method of claim 110, wherein the three dimensional coordinates
are the three dimensional coordinates shown in Table 4.

114. A method of identifying a NR modulator that selectively modulates
the biological activity of one NR compared to GRα, the method comprising:

(a) providing an atomic structure coordinate set describing a GRα
ligand binding domain structure and at least one other atomic
structure coordinate set describing a NR ligand binding domain,
each ligand binding domain comprising a ligand binding site;

(b) comparing the atomic structure coordinate sets to identify at least
one difference between the sets;

(c) designing a candidate ligand predicted to interact with the difference
of step (b);

(d) synthesizing the candidate ligand; and

(e) testing the synthesized candidate ligand for an ability to selectively
modulate a NR as compared to GRα, whereby a NR modulator that
selectively modulates the biological activity of one NR compared to GRα is
identified.

115. The method of claim 114, wherein the GRα atomic structure
coordinate set is the atomic structure coordinate set shown in Table 4.

116. The method of claim 114, wherein the NR is selected from the group
consisting of MR, PR, AR, GRβ and isoforms thereof that have ligands that also
bind GRα.
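
Illustrative note (not part of the claims): the lattice constants recited in claim 29 (a = b = 126.014 Å, c = 86.312 Å, α = β = 90°, γ = 120°) fix the volume of the hexagonal unit cell. The sketch below computes that volume with the general triclinic formula and compares it with the hexagonal shortcut V = (√3/2)·a²·c. The function name `unit_cell_volume` is a hypothetical helper chosen for this illustration; nothing here is taken from the application itself.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume (cubic angstroms) of a unit cell from its six lattice constants.

    Uses the general triclinic expression, which reduces to
    (sqrt(3) / 2) * a**2 * c for a hexagonal cell.
    """
    ca = math.cos(math.radians(alpha_deg))
    cb = math.cos(math.radians(beta_deg))
    cg = math.cos(math.radians(gamma_deg))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


# Lattice constants as recited in claim 29.
volume = unit_cell_volume(126.014, 126.014, 86.312, 90.0, 90.0, 120.0)

# Hexagonal shortcut, used here only as a cross-check.
hexagonal = math.sqrt(3) / 2 * 126.014**2 * 86.312

print(f"general formula:    {volume:.1f} cubic angstroms")
print(f"hexagonal shortcut: {hexagonal:.1f} cubic angstroms")
```

Both expressions give a cell volume of roughly 1.19 × 10⁶ Å³, which is the quantity that would be divided among the symmetry-related copies of the asymmetric unit in space group P61.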
[Figure: sequence alignment of nuclear receptor ligand binding domains, including rows labeled GR, MR, PR and AR, annotated with the secondary-structure elements beta-3, beta-4, helix-6, helix-7, helix-8, beta-5, helix-9, helix-10, helix-AF and beta-6.]

[Figure: diagram labeled with GR residues I747, F749, N564, C736, M560, W600, G567, L566, Q570, T739, Y735 and L732.]
Apolito, Christopher J.

<120> CRYSTALLIZED GLUCOCORTICOID RECEPTOR LIGAND BINDING DOMAIN POLYPEPTIDE AND SCREENING
METHODS EMPLOYING SAME

<130> Docket No. PU4523

<140>

<141>

<150> 60/305,902

<151> 2001-07-17

<160> 41

<170> PatentIn version 3.1

<210> 1

<211> 2334

<212> DNA

<213> Homo sapiens

<220>

<221> CDS

<222> (1)..(2334)

<223>

<400> 1

atg gac tcc aaa gaa tca tta act cct ggt aga gaa gaa aac ccc agc 48
Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15

agt gtg ctt gct cag gag agg gga gat gtg atg gac ttc tat aaa acc 96
Ser Val Leu Ala Gln Glu Arg Gly Asp Val Met Asp Phe Tyr Lys Thr
20 25 30

cta aga gga gga gct act gtg aag gtt tct gcg tct tca ccc tca ctg 144
Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro Ser Leu
35 40 45

gct gtc gct tct caa tca gac tcc aag cag cga aga ctt ttg gtt gat 192
Ala Val Ala Ser Gln Ser Asp Ser Lys Gln Arg Arg Leu Leu Val Asp
50 55 60

ttt cca aaa ggc tca gta agc aat gcg cag cag cca gat ctg tcc aaa 240
Phe Pro Lys Gly Ser Val Ser Asn Ala Gln Gln Pro Asp Leu Ser Lys
65 70 75 80

gca gtt tca ctc tca atg gga ctg tat atg gga gag aca gaa aca aaa 288
Ala Val Ser Leu Ser Met Gly Leu Tyr Met Gly Glu Thr Glu Thr Lys
85 90 95

gtg atg gga aat gac ctg gga ttc cca cag cag ggc caa atc agc ctt 336
Val Met Gly Asn Asp Leu Gly Phe Pro Gln Gln Gly Gln Ile Ser Leu
100 105 110

tcc tcg ggg gaa aca gac tta aag ctt ttg gaa gaa agc att gca aac 384
Ser Ser Gly Glu Thr Asp Leu Lys Leu Leu Glu Glu Ser Ile Ala Asn
115 120 125

ctc aat agg tcg acc agt gtt cca gag aac ccc aag agt tca gca tcc 432
Leu Asn Arg Ser Thr Ser Val Pro Glu Asn Pro Lys Ser Ser Ala Ser
130 135 140

act gct gtg tct gct gcc ccc aca gag aag gag ttt cca aaa act cac 480
Thr Ala Val Ser Ala Ala Pro Thr Glu Lys Glu Phe Pro Lys Thr His
145 150 155 160

tct gat gta tct tca gaa cag caa cat ttg aag ggc cag act ggc acc 528
Ser Asp Val Ser Ser Glu Gln Gln His Leu Lys Gly Gln Thr Gly Thr
165 170 175

aac ggt ggc aat gtg aaa ttg tat acc aca gac caa agc acc ttt gac 576
Asn Gly Gly Asn Val Lys Leu Tyr Thr Thr Asp Gln Ser Thr Phe Asp
180 185 190

att ttg cag gat ttg gag ttt tct tct ggg tcc cca ggt aaa gag acg 624
Ile Leu Gln Asp Leu Glu Phe Ser Ser Gly Ser Pro Gly Lys Glu Thr
195 200 205

aat gag agt cct tgg aga tca gac ctg ttg ata gat gaa aac tgt ttg 672
Asn Glu Ser Pro Trp Arg Ser Asp Leu Leu Ile Asp Glu Asn Cys Leu
210 215 220

ctt tct cct ctg gcg gga gaa gac gat tca ttc ctt ttg gaa gga aac 720
Leu Ser Pro Leu Ala Gly Glu Asp Asp Ser Phe Leu Leu Glu Gly Asn
225 230 235 240

tcg aat gag gac tgc aag cct ctc att tta ccg gac act aaa ccc aaa 768
Ser Asn Glu Asp Cys Lys Pro Leu Ile Leu Pro Asp Thr Lys Pro Lys
245 250 255

att aag gat aat gga gat ctg gtt ttg tca agc ccc agt aat gta aca 816
Ile Lys Asp Asn Gly Asp Leu Val Leu Ser Ser Pro Ser Asn Val Thr
260 265 270

ctg ccc caa gtg aaa aca gaa aaa gaa gat ttc atc gaa ctc tgc acc 864
Leu Pro Gln Val Lys Thr Glu Lys Glu Asp Phe Ile Glu Leu Cys Thr
275 280 285

cct ggg gta att aag caa gag aaa ctg ggc aca gtt tac tgt cag gca 912
Pro Gly Val Ile Lys Gln Glu Lys Leu Gly Thr Val Tyr Cys Gln Ala
290 295 300

agc ttt cct gga gca aat ata att ggt aat aaa atg tcc gcc att tct 960
Ser Phe Pro Gly Ala Asn Ile Ile Gly Asn Lys Met Ser Ala Ile Ser
305 310 315 320

gtt cat ggt gtg agt acc tct gga gga cag atg tac cac tat gac atg 1008
Val His Gly Val Ser Thr Ser Gly Gly Gln Met Tyr His Tyr Asp Met
325 330 335

aat aca gca tcc ctt tct caa cag cag gat cag aag cct att ttt aat 1056
Asn Thr Ala Ser Leu Ser Gln Gln Gln Asp Gln Lys Pro Ile Phe Asn
340 345 350

gtc att cca cca att ccc gtt ggt tcc gaa aat tgg aat agg tgc caa 1104
Val Ile Pro Pro Ile Pro Val Gly Ser Glu Asn Trp Asn Arg Cys Gln
355 360 365

gga tct gga gat gac aac ttg act tct ctg ggg act ctg aac ttc cct 1152
Gly Ser Gly Asp Asp Asn Leu Thr Ser Leu Gly Thr Leu Asn Phe Pro
370 375 380

ggt cga aca gtt ttt tct aat ggc tat tca agc ccc agc atg aga cca 1200
Gly Arg Thr Val Phe Ser Asn Gly Tyr Ser Ser Pro Ser Met Arg Pro
385 390 395 400

gat gta agc tct cct cca tcc agc tcc tca aca gca aca aca gga cca 1248
Asp Val Ser Ser Pro Pro Ser Ser Ser Ser Thr Ala Thr Thr Gly Pro
405 410 415

cct ccc aaa ctc tgc ctg gtg tgc tct gat gaa gct tca gga tgt cat 1296
Pro Pro Lys Leu Cys Leu Val Cys Ser Asp Glu Ala Ser Gly Cys His
420 425 430

tat gga gtc tta act tgt gga agc tgt aaa gtt ttc ttc aaa aga gca 1344
Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys Arg Ala
435 440 445

gtg gaa gga cag cac aat tac cta tgt gct gga agg aat gat tgc atc 1392
Val Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg Asn Asp Cys Ile
450 455 460

atc gat aaa att cga aga aaa aac tgc cca gca tgc cgc tat cga aaa 1440
Ile Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys Arg Tyr Arg Lys
465 470 475 480

tgt ctt cag gct gga atg aac ctg gaa gct cga aaa aca aag aaa aaa 1488
Cys Leu Gln Ala Gly Met Asn Leu Glu Ala Arg Lys Thr Lys Lys Lys
485 490 495

ata aaa gga att cag cag gcc act aca gga gtc tca caa gaa acc tct 1536
Ile Lys Gly Ile Gln Gln Ala Thr Thr Gly Val Ser Gln Glu Thr Ser
500 505 510

gaa aat cct ggt aac aaa aca ata gtt cct gca acg tta cca caa ctc 1584
Glu Asn Pro Gly Asn Lys Thr Ile Val Pro Ala Thr Leu Pro Gln Leu
515 520 525

acc cct acc ctg gtg tca ctg ttg gag gtt att gaa cct gaa gtg tta 1632
Thr Pro Thr Leu Val Ser Leu Leu Glu Val Ile Glu Pro Glu Val Leu
530 535 540

tat gca gga tat gat agc tct gtt cca gac tca act tgg agg atc atg 1680
Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser Thr Trp Arg Ile Met
545 550 555 560

act acg ctc aac atg tta gga ggg cgg caa gtg att gca gca gtg aaa 1728
Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala Val Lys
565 570 575

tgg gca aag gca ata cca ggt ttc agg aac tta cac ctg gat gac caa 1776
Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp Asp Gln
580 585 590

atg acc cta ctg cag tac tcc tgg atg ttt ctt atg gca ttt gct ctg 1824
Met Thr Leu Leu Gln Tyr Ser Trp Met Phe Leu Met Ala Phe Ala Leu
595 600 605

ggg tgg aga tca tat aga caa tca agt gca aac ctg ctg tgt ttt gct 1872
Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala
610 615 620

Pro III feu »o f?' t at ?? 9 " 9 " a atg act cta c ~ t«c atg tac 
Pro Asp Leu lie lie Asn Glu Gin Arg Met Thr Leu Pro Cys Met Tyr 

" 630 635 " 640 

til rf 3 r g l ° aC at9 Ct9 tat gtt tCC tct Sag tta cac agg ctt 1968 
Asp Gin Cys Lys His Met Leu Tyr Val Ser Ser Glu Leu His Arg Leu 

6« 6 50 655 

G?n vll ser Tvr ??" l^ 9 S" ° tC t9t " 9 *** acC tta "9 «t etc 2016 
Gin Val ser Tyr Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu 

660 665 670 

Ser ttt III til ff.I I!! 99t . ?!.? f 39 f 9C C J> 9ag cta ttt gat gaa 2064 

Leu 

685 

ill lit ^ SH! ^ f. a ! 9 f 9 f ta ?? a aaa ?« att gtc aag agg 2112 

*» J* d 

700 



1920 ^ 



qok c ..I; ; * yau yyt ctg aag a y c caa gag cta ttt gat gaa 

Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gin Glu Leu Phe Asp Glu 
675 geo 

Tlo Z = ~ * tg 55° tac atc aaa cca gga aaa gec att gtc aaa aaa 

He Arg M et Thr Tyr lie Lys Glu Leu Gly Lys Ala lie Val Lys Ira 



?1« r?f c CC f 9 ° °f 9 330 t99 Cag cgg ttt: tat caa ct <? «a aaa 2160 

Glu Gly Asn Ser Ser Gin Asn Trp Gin Arg Phe Tyr Gin Leu Thr Lys 

710 715 720 



705 



t1?. i ! 9 ? c atg ° at gaa gtg gtt gaa aat ctc ctt aac tat tgc 
Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Leu Asn Tyr Cys 

725 730 735 



2208 



Ph« It It 9 gat aaq aCC atg agt att gaa «c ccc gag atg 2256 

Phe Gin Thr Phe Leu Asp Lys Thr Met Ser lie Glu Phe Pro Glu Me? 

740 745 750 

tta gct gaa atc atc acc aat cag ata cca aaa tat tca aat gga aat 2304
Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Tyr Ser Asn Gly Asn
755 760 765

atc aaa aaa ctt ctg ttt cat caa aag tga 2334
Ile Lys Lys Leu Leu Phe His Gln Lys
770 775

<210> 2 

<211> 777 

<212> PRT 

<213> Homo sapiens 



<400> 2 

Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15

Ser Val Leu Ala Gln Glu Arg Gly Asp Val Met Asp Phe Tyr Lys Thr
20 25 30

Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro Ser Leu
35 40 45

Ala Val Ala Ser Gln Ser Asp Ser Lys Gln Arg Arg Leu Leu Val Asp
50 55 60

Phe Pro Lys Gly Ser Val Ser Asn Ala Gln Gln Pro Asp Leu Ser Lys
65 70 75 80

Ala Val Ser Leu Ser Met Gly Leu Tyr Met Gly Glu Thr Glu Thr Lys
85 90 95

Val Met Gly Asn Asp Leu Gly Phe Pro Gln Gln Gly Gln Ile Ser Leu
100 105 110

Ser Ser Gly Glu Thr Asp Leu Lys Leu Leu Glu Glu Ser Ile Ala Asn
115 120 125

Leu Asn Arg Ser Thr Ser Val Pro Glu Asn Pro Lys Ser Ser Ala Ser
130 135 140

Thr Ala Val Ser Ala Ala Pro Thr Glu Lys Glu Phe Pro Lys Thr His
145 150 155 160

Ser Asp Val Ser Ser Glu Gln Gln His Leu Lys Gly Gln Thr Gly Thr
165 170 175

Asn Gly Gly Asn Val Lys Leu Tyr Thr Thr Asp Gln Ser Thr Phe Asp
180 185 190

Ile Leu Gln Asp Leu Glu Phe Ser Ser Gly Ser Pro Gly Lys Glu Thr
195 200 205

Asn Glu Ser Pro Trp Arg Ser Asp Leu Leu Ile Asp Glu Asn Cys Leu
210 215 220

Leu Ser Pro Leu Ala Gly Glu Asp Asp Ser Phe Leu Leu Glu Gly Asn
225 230 235 240

Ser Asn Glu Asp Cys Lys Pro Leu Ile Leu Pro Asp Thr Lys Pro Lys
245 250 255

Ile Lys Asp Asn Gly Asp Leu Val Leu Ser Ser Pro Ser Asn Val Thr
260 265 270

Leu Pro Gln Val Lys Thr Glu Lys Glu Asp Phe Ile Glu Leu Cys Thr
275 280 285

Pro Gly Val Ile Lys Gln Glu Lys Leu Gly Thr Val Tyr Cys Gln Ala
290 295 300

Ser Phe Pro Gly Ala Asn Ile Ile Gly Asn Lys Met Ser Ala Ile Ser
305 310 315 320

Val His Gly Val Ser Thr Ser Gly Gly Gln Met Tyr His Tyr Asp Met
325 330 335

Asn Thr Ala Ser Leu Ser Gln Gln Gln Asp Gln Lys Pro Ile Phe Asn
340 345 350

Val Ile Pro Pro Ile Pro Val Gly Ser Glu Asn Trp Asn Arg Cys Gln
355 360 365

Gly Ser Gly Asp Asp Asn Leu Thr Ser Leu Gly Thr Leu Asn Phe Pro
370 375 380

Gly Arg Thr Val Phe Ser Asn Gly Tyr Ser Ser Pro Ser Met Arg Pro
385 390 395 400

Asp Val Ser Ser Pro Pro Ser Ser Ser Ser Thr Ala Thr Thr Gly Pro
405 410 415

Pro Pro Lys Leu Cys Leu Val Cys Ser Asp Glu Ala Ser Gly Cys His
420 425 430

Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys Arg Ala
435 440 445

Val Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg Asn Asp Cys Ile
450 455 460

Ile Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys Arg Tyr Arg Lys
465 470 475 480

Cys Leu Gln Ala Gly Met Asn Leu Glu Ala Arg Lys Thr Lys Lys Lys
485 490 495

Ile Lys Gly Ile Gln Gln Ala Thr Thr Gly Val Ser Gln Glu Thr Ser
500 505 510

Glu Asn Pro Gly Asn Lys Thr Ile Val Pro Ala Thr Leu Pro Gln Leu
515 520 525

Thr Pro Thr Leu Val Ser Leu Leu Glu Val Ile Glu Pro Glu Val Leu
530 535 540

Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser Thr Trp Arg Ile Met
545 550 555 560

Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala Val Lys
565 570 575

Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp Asp Gln
580 585 590

Met Thr Leu Leu Gln Tyr Ser Trp Met Phe Leu Met Ala Phe Ala Leu
595 600 605

Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala
610 615 620

Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Thr Leu Pro Cys Met Tyr
625 630 635 640

Asp Gln Cys Lys His Met Leu Tyr Val Ser Ser Glu Leu His Arg Leu
645 650 655

Gln Val Ser Tyr Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu
660 665 670

Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gln Glu Leu Phe Asp Glu
675 680 685

Ile Arg Met Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile Val Lys Arg
690 695 700

Glu Gly Asn Ser Ser Gln Asn Trp Gln Arg Phe Tyr Gln Leu Thr Lys
705 710 715 720

Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Leu Asn Tyr Cys
725 730 735

Phe Gln Thr Phe Leu Asp Lys Thr Met Ser Ile Glu Phe Pro Glu Met
740 745 750

Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Tyr Ser Asn Gly Asn
755 760 765

Ile Lys Lys Leu Leu Phe His Gln Lys
770 775



<210> 3 

<211> 2334 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(2334) 

<223> 



<400> 3 

atg gac tcc aaa gaa tca tta act cct ggt aga gaa gaa aac ccc agc 48
Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15

agt gtg ctt gct cag gag agg gga gat gtg atg gac ttc tat aaa acc 96
Ser Val Leu Ala Gln Glu Arg Gly Asp Val Met Asp Phe Tyr Lys Thr
20 25 30

cta aga gga gga gct act gtg aag gtt tct gcg tct tca ccc tca ctg 144
Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro Ser Leu
35 40 45

gct gtc gct tct caa tca gac tcc aag cag cga aga ctt ttg gtt gat 192
Ala Val Ala Ser Gln Ser Asp Ser Lys Gln Arg Arg Leu Leu Val Asp
50 55 60

ttt cca aaa ggc tca gta agc aat gcg cag cag cca gat ctg tcc aaa 240
Phe Pro Lys Gly Ser Val Ser Asn Ala Gln Gln Pro Asp Leu Ser Lys
65 70 75 80

gca gtt tca ctc tca atg gga ctg tat atg gga gag aca gaa aca aaa 288
Ala Val Ser Leu Ser Met Gly Leu Tyr Met Gly Glu Thr Glu Thr Lys
85 90 95

gtg atg gga aat gac ctg gga ttc cca cag cag ggc caa atc agc ctt 336
Val Met Gly Asn Asp Leu Gly Phe Pro Gln Gln Gly Gln Ile Ser Leu
100 105 110

tcc tcg ggg gaa aca gac tta aag ctt ttg gaa gaa agc att gca aac 384
Ser Ser Gly Glu Thr Asp Leu Lys Leu Leu Glu Glu Ser Ile Ala Asn
115 120 125

ctc aat agg tcg acc agt gtt cca gag aac ccc aag agt tca gca tcc 432
Leu Asn Arg Ser Thr Ser Val Pro Glu Asn Pro Lys Ser Ser Ala Ser
130 135 140

act gct gtg tct gct gcc ccc aca gag aag gag ttt cca aaa act cac 480
Thr Ala Val Ser Ala Ala Pro Thr Glu Lys Glu Phe Pro Lys Thr His
145 150 155 160

tct gat gta tct tca gaa cag caa cat ttg aag ggc cag act ggc acc 528
Ser Asp Val Ser Ser Glu Gln Gln His Leu Lys Gly Gln Thr Gly Thr
165 170 175

aac ggt ggc aat gtg aaa ttg tat acc aca gac caa agc acc ttt gac 576
Asn Gly Gly Asn Val Lys Leu Tyr Thr Thr Asp Gln Ser Thr Phe Asp
180 185 190

att ttg cag gat ttg gag ttt tct tct ggg tcc cca ggt aaa gag acg 624
Ile Leu Gln Asp Leu Glu Phe Ser Ser Gly Ser Pro Gly Lys Glu Thr
195 200 205

aat gag agt cct tgg aga tca gac ctg ttg ata gat gaa aac tgt ttg 672
Asn Glu Ser Pro Trp Arg Ser Asp Leu Leu Ile Asp Glu Asn Cys Leu
210 215 220

ctt tct cct ctg gcg gga gaa gac gat tca ttc ctt ttg gaa gga aac 720
Leu Ser Pro Leu Ala Gly Glu Asp Asp Ser Phe Leu Leu Glu Gly Asn
225 230 235 240

tcg aat gag gac tgc aag cct ctc att tta ccg gac act aaa ccc aaa 768
Ser Asn Glu Asp Cys Lys Pro Leu Ile Leu Pro Asp Thr Lys Pro Lys
245 250 255



att aag gat aat gga gat ctg gtt ttg tca agc ccc agt aat gta aca 816
Ile Lys Asp Asn Gly Asp Leu Val Leu Ser Ser Pro Ser Asn Val Thr
260 265 270

ctg ccc caa gtg aaa aca gaa aaa gaa gat ttc atc gaa ctc tgc acc 864
Leu Pro Gln Val Lys Thr Glu Lys Glu Asp Phe Ile Glu Leu Cys Thr
275 280 285

cct ggg gta att aag caa gag aaa ctg ggc aca gtt tac tgt cag gca 912
Pro Gly Val Ile Lys Gln Glu Lys Leu Gly Thr Val Tyr Cys Gln Ala
290 295 300

agc ttt cct gga gca aat ata att ggt aat aaa atg tct gcc att tct 960
Ser Phe Pro Gly Ala Asn Ile Ile Gly Asn Lys Met Ser Ala Ile Ser
305 310 315 320

gtt cat ggt gtg agt acc tct gga gga cag atg tac cac tat gac atg 1008
Val His Gly Val Ser Thr Ser Gly Gly Gln Met Tyr His Tyr Asp Met
325 330 335

aat aca gca tcc ctt tct caa cag cag gat cag aag cct att ttt aat 1056
Asn Thr Ala Ser Leu Ser Gln Gln Gln Asp Gln Lys Pro Ile Phe Asn
340 345 350

gtc att cca cca att ccc gtt ggt tcc gaa aat tgg aat agg tgc caa 1104
Val Ile Pro Pro Ile Pro Val Gly Ser Glu Asn Trp Asn Arg Cys Gln
355 360 365

gga tct gga gat gac aac ttg act tct ctg ggg act ctg aac ttc cct 1152
Gly Ser Gly Asp Asp Asn Leu Thr Ser Leu Gly Thr Leu Asn Phe Pro
370 375 380

ggt cga aca gtt ttt tct aat ggc tat tca agc ccc agc atg aga cca 1200
Gly Arg Thr Val Phe Ser Asn Gly Tyr Ser Ser Pro Ser Met Arg Pro
385 390 395 400

gat gta agc tct cct cca tcc agc tcc tca aca gca aca aca gga cca 1248
Asp Val Ser Ser Pro Pro Ser Ser Ser Ser Thr Ala Thr Thr Gly Pro
405 410 415

cct ccc aaa ctc tgc ctg gtg tgc tct gat gaa gct tca gga tgt cat 1296
Pro Pro Lys Leu Cys Leu Val Cys Ser Asp Glu Ala Ser Gly Cys His
420 425 430

tat gga gtc tta act tgt gga agc tgt aaa gtt ttc ttc aaa aga gca 1344
Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys Arg Ala
435 440 445

gtg gaa gga cag cac aat tac cta tgt gct gga agg aat gat tgc atc 1392
Val Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg Asn Asp Cys Ile
450 455 460

atc gat aaa att cga aga aaa aac tgc cca gca tgc cgc tat cga aaa 1440
Ile Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys Arg Tyr Arg Lys
465 470 475 480

tgt ctt cag gct gga atg aac ctg gaa gct cga aaa aca aag aaa aaa 1488
Cys Leu Gln Ala Gly Met Asn Leu Glu Ala Arg Lys Thr Lys Lys Lys
485 490 495

ata aaa gga att cag cag gcc act aca gga gtc tca caa gaa acc tct 1536
Ile Lys Gly Ile Gln Gln Ala Thr Thr Gly Val Ser Gln Glu Thr Ser
500 505 510

gaa aat cct ggt aac aaa aca ata gtt cct gca acg tta cca caa ctc 1584
Glu Asn Pro Gly Asn Lys Thr Ile Val Pro Ala Thr Leu Pro Gln Leu
515 520 525

acc cct acc ctg gtg tca ctg ttg gag gtt att gaa cct gaa gtg tta 1632
Thr Pro Thr Leu Val Ser Leu Leu Glu Val Ile Glu Pro Glu Val Leu
530 535 540

tat gca gga tat gat agc tct gtt cca gac tca act tgg agg atc atg 1680
Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser Thr Trp Arg Ile Met
545 550 555 560



act acg ctc aac atg tta gga ggg cgg caa gtg att gca gca gtg aaa 1728
Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala Val Lys
565 570 575

tgg gca aag gca ata cca ggt ttc agg aac tta cac ctg gat gac caa 1776
Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp Asp Gln
580 585 590

atg acc cta ctg cag tac tcc tgg atg tcc ctt atg gca ttt gct ctg 1824
Met Thr Leu Leu Gln Tyr Ser Trp Met Ser Leu Met Ala Phe Ala Leu
595 600 605

ggg tgg aga tca tat aga caa tca agt gca aac ctg ctg tgt ttt gct 1872
Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala
610 615 620

cct gat ctg att att aat gag cag aga atg act cta ccc tgc atg tac 1920
Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Thr Leu Pro Cys Met Tyr
625 630 635 640

gac caa tgt aaa cac atg ctg tat gtt tcc tct gag tta cac agg ctt 1968
Asp Gln Cys Lys His Met Leu Tyr Val Ser Ser Glu Leu His Arg Leu
645 650 655

cag gta tct tat gaa gag tat ctc tgt atg aaa acc tta ctg ctt ctc 2016
Gln Val Ser Tyr Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu
660 665 670

tct tca gtt cct aag gac ggt ctg aag agc caa gag cta ttt gat gaa 2064
Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gln Glu Leu Phe Asp Glu
675 680 685

att aga atg acc tac atc aaa gag cta gga aaa gcc att gtc aag agg 2112
Ile Arg Met Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile Val Lys Arg
690 695 700

gaa gga aac tcc agc cag aac tgg cag cgg ttt tat caa ctg aca aaa 2160
Glu Gly Asn Ser Ser Gln Asn Trp Gln Arg Phe Tyr Gln Leu Thr Lys
705 710 715 720

ctc ttg gat tct atg cat gaa gtg gtt gaa aat ctc ctt aac tat tgc 2208
Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Leu Asn Tyr Cys
725 730 735

ttc caa aca ttt ttg gat aag acc atg agt att gaa ttc ccc gag atg 2256
Phe Gln Thr Phe Leu Asp Lys Thr Met Ser Ile Glu Phe Pro Glu Met
740 745 750

tta gct gaa atc atc acc aat cag ata cca aaa tat tca aat gga aat 2304
Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Tyr Ser Asn Gly Asn
755 760 765

atc aaa aaa ctt ctg ttt cat caa aag tga 2334
Ile Lys Lys Leu Leu Phe His Gln Lys
770 775

<210> 4

<211> 777

<212> PRT

<213> Homo sapiens

<400> 4



Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15

Ser Val Leu Ala Gln Glu Arg Gly Asp Val Met Asp Phe Tyr Lys Thr
20 25 30

Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro Ser Leu
35 40 45

Ala Val Ala Ser Gln Ser Asp Ser Lys Gln Arg Arg Leu Leu Val Asp
50 55 60

Phe Pro Lys Gly Ser Val Ser Asn Ala Gln Gln Pro Asp Leu Ser Lys
65 70 75 80

Ala Val Ser Leu Ser Met Gly Leu Tyr Met Gly Glu Thr Glu Thr Lys
85 90 95

Val Met Gly Asn Asp Leu Gly Phe Pro Gln Gln Gly Gln Ile Ser Leu
100 105 110

Ser Ser Gly Glu Thr Asp Leu Lys Leu Leu Glu Glu Ser Ile Ala Asn
115 120 125

Leu Asn Arg Ser Thr Ser Val Pro Glu Asn Pro Lys Ser Ser Ala Ser
130 135 140

Thr Ala Val Ser Ala Ala Pro Thr Glu Lys Glu Phe Pro Lys Thr His
145 150 155 160

Ser Asp Val Ser Ser Glu Gln Gln His Leu Lys Gly Gln Thr Gly Thr
165 170 175

Asn Gly Gly Asn Val Lys Leu Tyr Thr Thr Asp Gln Ser Thr Phe Asp
180 185 190

Ile Leu Gln Asp Leu Glu Phe Ser Ser Gly Ser Pro Gly Lys Glu Thr
195 200 205



Asn Glu Ser Pro Trp Arg Ser Asp Leu Leu Ile Asp Glu Asn Cys Leu
210 215 220

Leu Ser Pro Leu Ala Gly Glu Asp Asp Ser Phe Leu Leu Glu Gly Asn
225 230 235 240

Ser Asn Glu Asp Cys Lys Pro Leu Ile Leu Pro Asp Thr Lys Pro Lys
245 250 255

Ile Lys Asp Asn Gly Asp Leu Val Leu Ser Ser Pro Ser Asn Val Thr
260 265 270

Leu Pro Gln Val Lys Thr Glu Lys Glu Asp Phe Ile Glu Leu Cys Thr
275 280 285

Pro Gly Val Ile Lys Gln Glu Lys Leu Gly Thr Val Tyr Cys Gln Ala
290 295 300

Ser Phe Pro Gly Ala Asn Ile Ile Gly Asn Lys Met Ser Ala Ile Ser
305 310 315 320

Val His Gly Val Ser Thr Ser Gly Gly Gln Met Tyr His Tyr Asp Met
325 330 335

Asn Thr Ala Ser Leu Ser Gln Gln Gln Asp Gln Lys Pro Ile Phe Asn
340 345 350

Val Ile Pro Pro Ile Pro Val Gly Ser Glu Asn Trp Asn Arg Cys Gln
355 360 365

Gly Ser Gly Asp Asp Asn Leu Thr Ser Leu Gly Thr Leu Asn Phe Pro
370 375 380



Gly Arg Thr Val Phe Ser Asn Gly Tyr Ser Ser Pro Ser Met Arg Pro
385 390 395 400

Asp Val Ser Ser Pro Pro Ser Ser Ser Ser Thr Ala Thr Thr Gly Pro
405 410 415

Pro Pro Lys Leu Cys Leu Val Cys Ser Asp Glu Ala Ser Gly Cys His
420 425 430

Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys Arg Ala
435 440 445

Val Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg Asn Asp Cys Ile
450 455 460

Ile Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys Arg Tyr Arg Lys
465 470 475 480

Cys Leu Gln Ala Gly Met Asn Leu Glu Ala Arg Lys Thr Lys Lys Lys
485 490 495

Ile Lys Gly Ile Gln Gln Ala Thr Thr Gly Val Ser Gln Glu Thr Ser
500 505 510

Glu Asn Pro Gly Asn Lys Thr Ile Val Pro Ala Thr Leu Pro Gln Leu
515 520 525



Thr Pro Thr Leu Val Ser Leu Leu Glu Val Ile Glu Pro Glu Val Leu
530 535 540

Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser Thr Trp Arg Ile Met
545 550 555 560

Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala Val Lys
565 570 575

Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp Asp Gln
580 585 590

Met Thr Leu Leu Gln Tyr Ser Trp Met Ser Leu Met Ala Phe Ala Leu
595 600 605

Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala
610 615 620

Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Thr Leu Pro Cys Met Tyr
625 630 635 640

Asp Gln Cys Lys His Met Leu Tyr Val Ser Ser Glu Leu His Arg Leu
645 650 655

Gln Val Ser Tyr Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu
660 665 670



Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gln Glu Leu Phe Asp Glu
675 680 685

Ile Arg Met Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile Val Lys Arg
690 695 700

Glu Gly Asn Ser Ser Gln Asn Trp Gln Arg Phe Tyr Gln Leu Thr Lys
705 710 715 720

Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Leu Asn Tyr Cys
725 730 735

Phe Gln Thr Phe Leu Asp Lys Thr Met Ser Ile Glu Phe Pro Glu Met
740 745 750

Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Tyr Ser Asn Gly Asn
755 760 765

Ile Lys Lys Leu Leu Phe His Gln Lys
770 775

<220>



<221> CDS 



<222> (1)..(2334) 



<223> 



<400> 5 

atg gac tcc aaa gaa tca tta act cct ggt aga gaa gaa aac ccc agc 48
Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15

agt gtg ctt gct cag gag agg gga gat gtg atg gac ttc tat aaa acc 96
Ser Val Leu Ala Gln Glu Arg Gly Asp Val Met Asp Phe Tyr Lys Thr
20 25 30

cta aga gga gga gct act gtg aag gtt tct gcg tct tca ccc tca ctg 144
Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro Ser Leu
35 40 45

gct gtc gct tct caa tca gac tcc aag cag cga aga ctt ttg gtt gat 192
Ala Val Ala Ser Gln Ser Asp Ser Lys Gln Arg Arg Leu Leu Val Asp
50 55 60

ttt cca aaa ggc tca gta agc aat gcg cag cag cca gat ctg tcc aaa 240
Phe Pro Lys Gly Ser Val Ser Asn Ala Gln Gln Pro Asp Leu Ser Lys
65 70 75 80

gca gtt tca ctc tca atg gga ctg tat atg gga gag aca gaa aca aaa 288
Ala Val Ser Leu Ser Met Gly Leu Tyr Met Gly Glu Thr Glu Thr Lys
85 90 95

gtg atg gga aat gac ctg gga ttc cca cag cag ggc caa atc agc ctt 336
Val Met Gly Asn Asp Leu Gly Phe Pro Gln Gln Gly Gln Ile Ser Leu
100 105 110

tcc tcg ggg gaa aca gac tta aag ctt ttg gaa gaa agc att gca aac 384
Ser Ser Gly Glu Thr Asp Leu Lys Leu Leu Glu Glu Ser Ile Ala Asn
115 120 125

ctc aat agg tcg acc agt gtt cca gag aac ccc aag agt tca gca tcc 432
Leu Asn Arg Ser Thr Ser Val Pro Glu Asn Pro Lys Ser Ser Ala Ser
130 135 140

act gct gtg tct gct gcc ccc aca gag aag gag ttt cca aaa act cac 480
Thr Ala Val Ser Ala Ala Pro Thr Glu Lys Glu Phe Pro Lys Thr His
145 150 155 160

tct gat gta tct tca gaa cag caa cat ttg aag ggc cag act ggc acc 528
Ser Asp Val Ser Ser Glu Gln Gln His Leu Lys Gly Gln Thr Gly Thr
165 170 175

aac ggt ggc aat gtg aaa ttg tat acc aca gac caa agc acc ttt gac 576
Asn Gly Gly Asn Val Lys Leu Tyr Thr Thr Asp Gln Ser Thr Phe Asp
180 185 190

att ttg cag gat ttg gag ttt tct tct ggg tcc cca ggt aaa gag acg 624
Ile Leu Gln Asp Leu Glu Phe Ser Ser Gly Ser Pro Gly Lys Glu Thr
195 200 205

aat gag agt cct tgg aga tca gac ctg ttg ata gat gaa aac tgt ttg 672
Asn Glu Ser Pro Trp Arg Ser Asp Leu Leu Ile Asp Glu Asn Cys Leu
210 215 220

ctt tct cct ctg gcg gga gaa gac gat tca ttc ctt ttg gaa gga aac 720
Leu Ser Pro Leu Ala Gly Glu Asp Asp Ser Phe Leu Leu Glu Gly Asn
225 230 235 240



tcg aat gag gac tgc aag cct ctc att tta ccg gac act aaa ccc aaa 768
Ser Asn Glu Asp Cys Lys Pro Leu Ile Leu Pro Asp Thr Lys Pro Lys
245 250 255

att aag gat aat gga gat ctg gtt ttg tca agc ccc agt aat gta aca 816
Ile Lys Asp Asn Gly Asp Leu Val Leu Ser Ser Pro Ser Asn Val Thr
260 265 270

ctg ccc caa gtg aaa aca gaa aaa gaa gat ttc atc gaa ctc tgc acc 864
Leu Pro Gln Val Lys Thr Glu Lys Glu Asp Phe Ile Glu Leu Cys Thr
275 280 285

cct ggg gta att aag caa gag aaa ctg ggc aca gtt tac tgt cag gca 912
Pro Gly Val Ile Lys Gln Glu Lys Leu Gly Thr Val Tyr Cys Gln Ala
290 295 300

agc ttt cct gga gca aat ata att ggt aat aaa atg tct gcc att tct 960
Ser Phe Pro Gly Ala Asn Ile Ile Gly Asn Lys Met Ser Ala Ile Ser
305 310 315 320

gtt cat ggt gtg agt acc tct gga gga cag atg tac cac tat gac atg 1008
Val His Gly Val Ser Thr Ser Gly Gly Gln Met Tyr His Tyr Asp Met
325 330 335

aat aca gca tcc ctt tct caa cag cag gat cag aag cct att ttt aat 1056
Asn Thr Ala Ser Leu Ser Gln Gln Gln Asp Gln Lys Pro Ile Phe Asn
340 345 350

gtc att cca cca att ccc gtt ggt tcc gaa aat tgg aat agg tgc caa 1104
Val Ile Pro Pro Ile Pro Val Gly Ser Glu Asn Trp Asn Arg Cys Gln
355 360 365

gga tct gga gat gac aac ttg act tct ctg ggg act ctg aac ttc cct 1152
Gly Ser Gly Asp Asp Asn Leu Thr Ser Leu Gly Thr Leu Asn Phe Pro
370 375 380

ggt cga aca gtt ttt tct aat ggc tat tca agc ccc agc atg aga cca 1200
Gly Arg Thr Val Phe Ser Asn Gly Tyr Ser Ser Pro Ser Met Arg Pro
385 390 395 400

gat gta agc tct cct cca tcc agc tcc tca aca gca aca aca gga cca 1248
Asp Val Ser Ser Pro Pro Ser Ser Ser Ser Thr Ala Thr Thr Gly Pro
405 410 415

cct ccc aaa ctc tgc ctg gtg tgc tct gat gaa gct tca gga tgt cat 1296
Pro Pro Lys Leu Cys Leu Val Cys Ser Asp Glu Ala Ser Gly Cys His
420 425 430

tat gga gtc tta act tgt gga agc tgt aaa gtt ttc ttc aaa aga gca 1344
Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys Arg Ala
435 440 445

gtg gaa gga cag cac aat tac cta tgt gct gga agg aat gat tgc atc 1392
Val Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg Asn Asp Cys Ile
450 455 460

atc gat aaa att cga aga aaa aac tgc cca gca tgc cgc tat cga aaa 1440
Ile Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys Arg Tyr Arg Lys
465 470 475 480

tgt ctt cag gct gga atg aac ctg gaa gct cga aaa aca aag aaa aaa 1488
Cys Leu Gln Ala Gly Met Asn Leu Glu Ala Arg Lys Thr Lys Lys Lys
485 490 495

ata aaa gga att cag cag gcc act aca gga gtc tca caa gaa acc tct 1536
Ile Lys Gly Ile Gln Gln Ala Thr Thr Gly Val Ser Gln Glu Thr Ser
500 505 510

gaa aat cct ggt aac aaa aca ata gtt cct gca acg tta cca caa ctc 1584
Glu Asn Pro Gly Asn Lys Thr Ile Val Pro Ala Thr Leu Pro Gln Leu
515 520 525

acc cct acc ctg gtg tca ctg ttg gag gtt att gaa cct gaa gtg tta 1632
Thr Pro Thr Leu Val Ser Leu Leu Glu Val Ile Glu Pro Glu Val Leu
530 535 540

tat gca gga tat gat agc tct gtt cca gac tca act tgg agg atc atg 1680
Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser Thr Trp Arg Ile Met
545 550 555 560

act acg ctc aac atg tta gga ggg cgg caa gtg att gca gca gtg aaa 1728
Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala Val Lys
565 570 575

tgg gca aag gca ata cca ggt ttc agg aac tta cac ctg gat gac caa 1776
Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp Asp Gln
580 585 590

atg acc cta ctg cag tac tcc tgg atg gac ctt atg gca ttt gct ctg 1824
Met Thr Leu Leu Gln Tyr Ser Trp Met Asp Leu Met Ala Phe Ala Leu
595 600 605

ggg tgg aga tca tat aga caa tca agt gca aac ctg ctg tgt ttt gct 1872
Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala
610 615 620

cct gat ctg att att aat gag cag aga atg act cta ccc tgc atg tac 1920
Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Thr Leu Pro Cys Met Tyr
625 630 635 640

gac caa tgt aaa cac atg ctg tat gtt tcc tct gag tta cac agg ctt 1968
Asp Gln Cys Lys His Met Leu Tyr Val Ser Ser Glu Leu His Arg Leu
645 650 655

cag gta tct tat gaa gag tat ctc tgt atg aaa acc tta ctg ctt ctc 2016
Gln Val Ser Tyr Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu
660 665 670

tct tca gtt cct aag gac ggt ctg aag agc caa gag cta ttt gat gaa 2064
Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gln Glu Leu Phe Asp Glu
675 680 685

att aga atg acc tac atc aaa gag cta gga aaa gcc att gtc aag agg 2112
Ile Arg Met Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile Val Lys Arg
690 695 700

gaa gga aac tcc agc cag aac tgg cag cgg ttt tat caa ctg aca aaa 2160
Glu Gly Asn Ser Ser Gln Asn Trp Gln Arg Phe Tyr Gln Leu Thr Lys
705 710 715 720

ctc ttg gat tct atg cat gaa gtg gtt gaa aat ctc ctt aac tat tgc 2208
Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Leu Asn Tyr Cys
725 730 735

ttc caa aca ttt ttg gat aag acc atg agt att gaa ttc ccc gag atg 2256
Phe Gln Thr Phe Leu Asp Lys Thr Met Ser Ile Glu Phe Pro Glu Met
740 745 750

tta gct gaa atc atc acc aat cag ata cca aaa tat tca aat gga aat 2304
Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Tyr Ser Asn Gly Asn
755 760 765

atc aaa aaa ctt ctg ttt cat caa aag tga 2334
Ile Lys Lys Leu Leu Phe His Gln Lys
770 775

<210> 6

<211> 777

<212> PRT

<213> Homo sapiens

<400> 6

Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15

Ser Val Leu Ala Gln Glu Arg Gly Asp Val Met Asp Phe Tyr Lys Thr
20 25 30

Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro Ser Leu
35 40 45

Ala Val Ala Ser Gln Ser Asp Ser Lys Gln Arg Arg Leu Leu Val Asp
50 55 60

Phe Pro Lys Gly Ser Val Ser Asn Ala Gln Gln Pro Asp Leu Ser Lys
65 70 75 80

Ala Val Ser Leu Ser Met Gly Leu Tyr Met Gly Glu Thr Glu Thr Lys
85 90 95

Val Met Gly Asn Asp Leu Gly Phe Pro Gln Gln Gly Gln Ile Ser Leu
100 105 110

Ser Ser Gly Glu Thr Asp Leu Lys Leu Leu Glu Glu Ser Ile Ala Asn
115 120 125

Leu Asn Arg Ser Thr Ser Val Pro Glu Asn Pro Lys Ser Ser Ala Ser
130 135 140

Thr Ala Val Ser Ala Ala Pro Thr Glu Lys Glu Phe Pro Lys Thr His
145 150 155 160

Ser Asp Val Ser Ser Glu Gln Gln His Leu Lys Gly Gln Thr Gly Thr
165 170 175

Asn Gly Gly Asn Val Lys Leu Tyr Thr Thr Asp Gln Ser Thr Phe Asp
180 185 190



Ile Leu Gln Asp Leu Glu Phe Ser Ser Gly Ser Pro Gly Lys Glu Thr
195 200 205

Asn Glu Ser Pro Trp Arg Ser Asp Leu Leu Ile Asp Glu Asn Cys Leu
210 215 220

Leu Ser Pro Leu Ala Gly Glu Asp Asp Ser Phe Leu Leu Glu Gly Asn
225 230 235 240

Ser Asn Glu Asp Cys Lys Pro Leu Ile Leu Pro Asp Thr Lys Pro Lys
245 250 255

Ile Lys Asp Asn Gly Asp Leu Val Leu Ser Ser Pro Ser Asn Val Thr
260 265 270

Leu Pro Gln Val Lys Thr Glu Lys Glu Asp Phe Ile Glu Leu Cys Thr
275 280 285

Pro Gly Val Ile Lys Gln Glu Lys Leu Gly Thr Val Tyr Cys Gln Ala
290 295 300

Ser Phe Pro Gly Ala Asn Ile Ile Gly Asn Lys Met Ser Ala Ile Ser
305 310 315 320

Val His Gly Val Ser Thr Ser Gly Gly Gln Met Tyr His Tyr Asp Met
325 330 335

Asn Thr Ala Ser Leu Ser Gln Gln Gln Asp Gln Lys Pro Ile Phe Asn
340 345 350

Val Ile Pro Pro Ile Pro Val Gly Ser Glu Asn Trp Asn Arg Cys Gln
355 360 365



Gly Ser Gly Asp Asp Asn Leu Thr Ser Leu Gly Thr Leu Asn Phe Pro
370 375 380

Gly Arg Thr Val Phe Ser Asn Gly Tyr Ser Ser Pro Ser Met Arg Pro
385 390 395 400

Asp Val Ser Ser Pro Pro Ser Ser Ser Ser Thr Ala Thr Thr Gly Pro
405 410 415

Pro Pro Lys Leu Cys Leu Val Cys Ser Asp Glu Ala Ser Gly Cys His
420 425 430

Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys Arg Ala
435 440 445

Val Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg Asn Asp Cys Ile
450 455 460

Ile Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys Arg Tyr Arg Lys
465 470 475 480

Cys Leu Gln Ala Gly Met Asn Leu Glu Ala Arg Lys Thr Lys Lys Lys
485 490 495

Ile Lys Gly Ile Gln Gln Ala Thr Thr Gly Val Ser Gln Glu Thr Ser
500 505 510

Glu Asn Pro Gly Asn Lys Thr Ile Val Pro Ala Thr Leu Pro Gln Leu
515 520 525



Thr Pro Thr Leu Val Ser Leu Leu Glu Val Ile Glu Pro Glu Val Leu
530 535 540

Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser Thr Trp Arg Ile Met
545 550 555 560

Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala Val Lys
565 570 575

Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp Asp Gln
580 585 590

Met Thr Leu Leu Gln Tyr Ser Trp Met Asp Leu Met Ala Phe Ala Leu
595 600 605

Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala
610 615 620

Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Thr Leu Pro Cys Met Tyr
625 630 635 640

Asp Gln Cys Lys His Met Leu Tyr Val Ser Ser Glu Leu His Arg Leu
645 650 655

Gln Val Ser Tyr Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu
660 665 670

Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gln Glu Leu Phe Asp Glu
675 680 685

Ile Arg Met Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile Val Lys Arg
690 695 700

Glu Gly Asn Ser Ser Gln Asn Trp Gln Arg Phe Tyr Gln Leu Thr Lys
705 710 715 720

Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Leu Asn Tyr Cys
725 730 735

Phe Gln Thr Phe Leu Asp Lys Thr Met Ser Ile Glu Phe Pro Glu Met
740 745 750

Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Tyr Ser Asn Gly Asn
755 760 765

Ile Lys Lys Leu Leu Phe His Gln Lys
770 775

<222> (1)..(2334)

<223> n = a or c or g or t/u

<400> 7

atg gac tcc aaa gaa tca tta act cct ggt aga gaa gaa aac ccc agc 48
Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15

agt gtg ctt gct cag gag agg gga gat gtg atg gac ttc tat aaa acc 96
Ser Val Leu Ala Gln Glu Arg Gly Asp Val Met Asp Phe Tyr Lys Thr
20 25 30



cta aga gga gga gct act gtg aag gtt tct gcg tct tca ccc tca ctg 144
Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro Ser Leu
35 40 45

gct gtc gct tct caa tca gac tcc aag cag cga aga ctt ttg gtt gat 192
Ala Val Ala Ser Gln Ser Asp Ser Lys Gln Arg Arg Leu Leu Val Asp
50 55 60

ttt cca aaa ggc tca gta agc aat gcg cag cag cca gat ctg tcc aaa 240
Phe Pro Lys Gly Ser Val Ser Asn Ala Gln Gln Pro Asp Leu Ser Lys
65 70 75 80



gca gtt tca ctc tca atg gga ctg tat atg gga gag aca gaa aca aaa 288
Ala Val Ser Leu Ser Met Gly Leu Tyr Met Gly Glu Thr Glu Thr Lys
85 90 95

gtg atg gga aat gac ctg gga ttc cca cag cag ggc caa atc agc ctt 336
Val Met Gly Asn Asp Leu Gly Phe Pro Gln Gln Gly Gln Ile Ser Leu
100 105 110

tcc tcg ggg gaa aca gac tta aag ctt ttg gaa gaa agc att gca aac 384
Ser Ser Gly Glu Thr Asp Leu Lys Leu Leu Glu Glu Ser Ile Ala Asn
115 120 125

ctc aat agg tcg acc agt gtt cca gag aac ccc aag agt tca gca tcc 432
Leu Asn Arg Ser Thr Ser Val Pro Glu Asn Pro Lys Ser Ser Ala Ser
130 135 140

act gct gtg tct gct gcc ccc aca gag aag gag ttt cca aaa act cac 480
Thr Ala Val Ser Ala Ala Pro Thr Glu Lys Glu Phe Pro Lys Thr His
145 150 155 160



tct gat gta tct tca gaa cag caa cat ttg aag ggc cag act ggc acc 528
Ser Asp Val Ser Ser Glu Gln Gln His Leu Lys Gly Gln Thr Gly Thr
165 170 175

aac ggt ggc aat gtg aaa ttg tat acc aca gac caa agc acc ttt gac 576
Asn Gly Gly Asn Val Lys Leu Tyr Thr Thr Asp Gln Ser Thr Phe Asp
180 185 190

att ttg cag gat ttg gag ttt tct tct ggg tcc cca ggt aaa gag acg 624
Ile Leu Gln Asp Leu Glu Phe Ser Ser Gly Ser Pro Gly Lys Glu Thr
195 200 205

aat gag agt cct tgg aga tca gac ctg ttg ata gat gaa aac tgt ttg 672
Asn Glu Ser Pro Trp Arg Ser Asp Leu Leu Ile Asp Glu Asn Cys Leu
210 215 220

ctt tct cct ctg gcg gga gaa gac gat tca ttc ctt ttg gaa gga aac 720
Leu Ser Pro Leu Ala Gly Glu Asp Asp Ser Phe Leu Leu Glu Gly Asn
225 230 235 240

tcg aat gag gac tgc aag cct ctc att tta ccg gac act aaa ccc aaa 768
Ser Asn Glu Asp Cys Lys Pro Leu Ile Leu Pro Asp Thr Lys Pro Lys
245 250 255

att aag gat aat gga gat ctg gtt ttg tca agc ccc agt aat gta aca 816
Ile Lys Asp Asn Gly Asp Leu Val Leu Ser Ser Pro Ser Asn Val Thr
260 265 270



ctg ccc caa gtg aaa aca gaa aaa gaa gat ttc atc gaa ctc tgc acc 864
Leu Pro Gln Val Lys Thr Glu Lys Glu Asp Phe Ile Glu Leu Cys Thr
275 280 285

cct ggg gta att aag caa gag aaa ctg ggc aca gtt tac tgt cag gca 912
Pro Gly Val Ile Lys Gln Glu Lys Leu Gly Thr Val Tyr Cys Gln Ala
290 295 300

agc ttt cct gga gca aat ata att ggt aat aaa atg tct gcc att tct 960
Ser Phe Pro Gly Ala Asn Ile Ile Gly Asn Lys Met Ser Ala Ile Ser
305 310 315 320

gtt cat ggt gtg agt acc tct gga gga cag atg tac cac tat gac atg 1008
Val His Gly Val Ser Thr Ser Gly Gly Gln Met Tyr His Tyr Asp Met
325 330 335

aat aca gca tcc ctt tct caa cag cag gat cag aag cct att ttt aat 1056
Asn Thr Ala Ser Leu Ser Gln Gln Gln Asp Gln Lys Pro Ile Phe Asn
340 345 350

gtc att cca cca att ccc gtt ggt tcc gaa aat tgg aat agg tgc caa 1104
Val Ile Pro Pro Ile Pro Val Gly Ser Glu Asn Trp Asn Arg Cys Gln
355 360 365

gga tct gga gat gac aac ttg act tct ctg ggg act ctg aac ttc cct 1152
Gly Ser Gly Asp Asp Asn Leu Thr Ser Leu Gly Thr Leu Asn Phe Pro
370 375 380

ggt cga aca gtt ttt tct aat ggc tat tca agc ccc agc atg aga cca 1200
Gly Arg Thr Val Phe Ser Asn Gly Tyr Ser Ser Pro Ser Met Arg Pro
385 390 395 400

gat gta agc tct cct cca tcc agc tcc tca aca gca aca aca gga cca 1248
Asp Val Ser Ser Pro Pro Ser Ser Ser Ser Thr Ala Thr Thr Gly Pro
405 410 415

cct ccc aaa ctc tgc ctg gtg tgc tct gat gaa gct tca gga tgt cat 1296
Pro Pro Lys Leu Cys Leu Val Cys Ser Asp Glu Ala Ser Gly Cys His
420 425 430

tat gga gtc tta act tgt gga agc tgt aaa gtt ttc ttc aaa aga gca 1344
Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys Arg Ala
435 440 445

gtg gaa gga cag cac aat tac cta tgt gct gga agg aat gat tgc atc 1392
Val Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg Asn Asp Cys Ile
450 455 460

atc gat aaa att cga aga aaa aac tgc cca gca tgc cgc tat cga aaa 1440
Ile Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys Arg Tyr Arg Lys
465 470 475 480

tgt ctt cag gct gga atg aac ctg gaa gct cga aaa aca aag aaa aaa 1488
Cys Leu Gln Ala Gly Met Asn Leu Glu Ala Arg Lys Thr Lys Lys Lys
485 490 495

ata aaa gga att cag cag gcc act aca gga gtc tca caa gaa acc tct 1536
Ile Lys Gly Ile Gln Gln Ala Thr Thr Gly Val Ser Gln Glu Thr Ser
500 505 510

gaa aat cct ggt aac aaa aca ata gtt cct gca acg tta cca caa ctc 1584
Glu Asn Pro Gly Asn Lys Thr Ile Val Pro Ala Thr Leu Pro Gln Leu
515 520 525

acc cct acc ctg gtg tca nnn ttg gag nnn att gaa cct gaa gtg tta 1632
Thr Pro Thr Leu Val Ser Xaa Leu Glu Xaa Ile Glu Pro Glu Val Leu
530 535 540

tat gca gga tat gat agc tct nnn cca gac tca act nnn agg atc atg 1680
Tyr Ala Gly Tyr Asp Ser Ser Xaa Pro Asp Ser Thr Xaa Arg Ile Met
545 550 555 560

act acg ctc aac atg tta gga ggg cgg caa gtg att gca gca gtg aaa 1728
Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala Val Lys
565 570 575

tgg gca aag gca ata cca ggt ttc agg aac tta cac ctg gat gac caa 1776
Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp Asp Gln
580 585 590

atg acc cta ctg cag tac tcc tgg atg nnn ctt atg gca ttt gct ctg 1824
Met Thr Leu Leu Gln Tyr Ser Trp Met Xaa Leu Met Ala Phe Ala Leu
595 600 605

ggg tgg aga tca tat aga caa tca agt gca aac ctg ctg tgt ttt gct 1872
Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala
610 615 620



cct gat ctg att att aat gag cag aga atg act nnn ccc nnn atg tac 1920
Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Thr Xaa Pro Xaa Met Tyr
625 630 635 640

gac caa tgt aaa cac atg ctg nnn gtt tcc tct gag tta cac agg ctt 1968
Asp Gln Cys Lys His Met Leu Xaa Val Ser Ser Glu Leu His Arg Leu
645 650 655

cag gta tct nnn gaa gag tat ctc tgt atg aaa acc tta ctg ctt ctc 2016
Gln Val Ser Xaa Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu
660 665 670

tct tca gtt cct aag gac ggt ctg aag agc caa gag nnn ttt gat gaa 2064
Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gln Glu Xaa Phe Asp Glu
675 680 685

att aga nnn acc tac atc aaa gag cta gga aaa gcc att nnn aag agg 2112
Ile Arg Xaa Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile Xaa Lys Arg
690 695 700

gaa gga aac tcc agc cag aac nnn cag cgg ttt tat caa ctg aca aaa 2160
Glu Gly Asn Ser Ser Gln Asn Xaa Gln Arg Phe Tyr Gln Leu Thr Lys
705 710 715 720

ctc ttg gat tct atg cat gaa gtg gtt gaa aat ctc nnn aac tat tgc 2208
Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Xaa Asn Tyr Cys
725 730 735

ttc caa aca ttt nnn gat aag acc atg agt att gaa ttc ccc gag atg 2256
Phe Gln Thr Phe Xaa Asp Lys Thr Met Ser Ile Glu Phe Pro Glu Met
740 745 750

tta gct gaa atc atc acc aat cag ata cca aaa nnn tca aat gga aat 2304
Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Xaa Ser Asn Gly Asn
755 760 765

atc aaa aaa ctt ctg ttt cat caa aag tga 2334
Ile Lys Lys Leu Leu Phe His Gln Lys
770 775

<210> 8

<211> 777

<212> PRT

<213> Homo sapiens

<220>

<221> misc_feature
<222> (535)..(535)

<223> The 'Xaa' at location 535 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>

<221> misc_feature
<222> (538)..(538)

<223> The 'Xaa' at location 538 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>

<221> misc_feature
<222> (552)..(552)

<223> The 'Xaa' at location 552 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>

<221> misc_feature
<222> (557)..(557)

<223> The 'Xaa' at location 557 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>

<221> misc_feature
<222> (602)..(602)

<223> The 'Xaa' at location 602 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>

<221> misc_feature
<222> (636)..(636)

<223> The 'Xaa' at location 636 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>

<221> misc_feature
<222> (638)..(638)

<223> The 'Xaa' at location 638 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>

<221> misc_feature
<222> (648)..(648)

<223> The 'Xaa' at location 648 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>

<221> misc_feature
<222> (660)..(660)

<223> The 'Xaa' at location 660 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>

<221> misc_feature
<222> (685)..(685)

<223> The 'Xaa' at location 685 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

• 

<220> 

<221> misc_feature 
<222> (691) . . (691) 

<223> The 'Xaa 1 at location 691 stands for Cys, Asn f Tyr, Lys, Ser, Asp, 
Glu, Gin, Arg or Thr. 

« 

<220> 

<221> misc_feature 
<222> (702).. (702) 

<223> The 'Xaa' at location 702 stands for Cys, Asn, Tyr, Lys, Ser, Asp, 
Glu, Gin, Arg or Thr. 

* 

<220> 

* ■ 

<221> misc_feature 
<222> (712) . . (712) 

<223> The 'Xaa' at location 712 stands for Cys, Asn, Tyr, Lys, Ser, Asp, 
Glu, Gin, Arg or Thr. 

• 

<220> 

<221> misc_feature 
<222> (733) . . (733) 

<223> The *Xaa* at location 733 stands for Cys, Asn, Tyr, Lys, Ser, Asp, 
Glu, Gin, Arg or Thr. 

• 

<220> 

<221> misc^feature 
<222> (741) . . (741) 

<223> The 'Xaa' at location 741 stands for Cys, Asn, Tyr, Lys, Ser, Asp, 
Glu, Gin, Arg or Thr. 

* 

<220> 

<221> misc__feature 
<222> (764).. (764) 

<223> The 'Xaa* at location 764 stands for. Cys, Asn, Tyr, Lys, Ser, Asp, 
Glu, Gin, Arg or Thr. 

<220> 

<221> misc_feature 

<222> (1)..(2334) 

<223> n « a or c or g or t/u 

<400> 8 

Met Asp Ser Lys Glu Ser Leu Thr Pro Gly Arg Glu Glu Asn Pro Ser
1 5 10 15

Ser Val Leu Ala Gln Glu Arg Gly Asp Val Met Asp Phe Tyr Lys Thr
20 25 30

Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro Ser Leu
35 40 45

Ala Val Ala Ser Gln Ser Asp Ser Lys Gln Arg Arg Leu Leu Val Asp
50 55 60

Phe Pro Lys Gly Ser Val Ser Asn Ala Gln Gln Pro Asp Leu Ser Lys
65 70 75 80

Ala Val Ser Leu Ser Met Gly Leu Tyr Met Gly Glu Thr Glu Thr Lys
85 90 95

Val Met Gly Asn Asp Leu Gly Phe Pro Gln Gln Gly Gln Ile Ser Leu
100 105 110

Ser Ser Gly Glu Thr Asp Leu Lys Leu Leu Glu Glu Ser Ile Ala Asn
115 120 125

Leu Asn Arg Ser Thr Ser Val Pro Glu Asn Pro Lys Ser Ser Ala Ser
130 135 140

Thr Ala Val Ser Ala Ala Pro Thr Glu Lys Glu Phe Pro Lys Thr His
145 150 155 160

Ser Asp Val Ser Ser Glu Gln Gln His Leu Lys Gly Gln Thr Gly Thr
165 170 175

Asn Gly Gly Asn Val Lys Leu Tyr Thr Thr Asp Gln Ser Thr Phe Asp
180 185 190

Ile Leu Gln Asp Leu Glu Phe Ser Ser Gly Ser Pro Gly Lys Glu Thr
195 200 205

Asn Glu Ser Pro Trp Arg Ser Asp Leu Leu Ile Asp Glu Asn Cys Leu
210 215 220

Leu Ser Pro Leu Ala Gly Glu Asp Asp Ser Phe Leu Leu Glu Gly Asn
225 230 235 240

Ser Asn Glu Asp Cys Lys Pro Leu Ile Leu Pro Asp Thr Lys Pro Lys
245 250 255

Ile Lys Asp Asn Gly Asp Leu Val Leu Ser Ser Pro Ser Asn Val Thr
260 265 270

Leu Pro Gln Val Lys Thr Glu Lys Glu Asp Phe Ile Glu Leu Cys Thr
275 280 285

Pro Gly Val Ile Lys Gln Glu Lys Leu Gly Thr Val Tyr Cys Gln Ala
290 295 300

Ser Phe Pro Gly Ala Asn Ile Ile Gly Asn Lys Met Ser Ala Ile Ser
305 310 315 320



Val His Gly Val Ser Thr Ser Gly Gly Gln Met Tyr His Tyr Asp Met
325 330 335

Asn Thr Ala Ser Leu Ser Gln Gln Gln Asp Gln Lys Pro Ile Phe Asn
340 345 350

Val Ile Pro Pro Ile Pro Val Gly Ser Glu Asn Trp Asn Arg Cys Gln
355 360 365

Gly Ser Gly Asp Asp Asn Leu Thr Ser Leu Gly Thr Leu Asn Phe Pro
370 375 380

Gly Arg Thr Val Phe Ser Asn Gly Tyr Ser Ser Pro Ser Met Arg Pro
385 390 395 400

Asp Val Ser Ser Pro Pro Ser Ser Ser Ser Thr Ala Thr Thr Gly Pro
405 410 415

Pro Pro Lys Leu Cys Leu Val Cys Ser Asp Glu Ala Ser Gly Cys His
420 425 430

Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe Phe Lys Arg Ala
435 440 445

Val Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg Asn Asp Cys Ile
450 455 460

Ile Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys Arg Tyr Arg Lys
465 470 475 480

Cys Leu Gln Ala Gly Met Asn Leu Glu Ala Arg Lys Thr Lys Lys Lys
485 490 495

Ile Lys Gly Ile Gln Gln Ala Thr Thr Gly Val Ser Gln Glu Thr Ser
500 505 510

Glu Asn Pro Gly Asn Lys Thr Ile Val Pro Ala Thr Leu Pro Gln Leu
515 520 525

Thr Pro Thr Leu Val Ser Xaa Leu Glu Xaa Ile Glu Pro Glu Val Leu
530 535 540

Tyr Ala Gly Tyr Asp Ser Ser Xaa Pro Asp Ser Thr Xaa Arg Ile Met
545 550 555 560

Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala Val Lys
565 570 575

Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp Asp Gln
580 585 590

Met Thr Leu Leu Gln Tyr Ser Trp Met Xaa Leu Met Ala Phe Ala Leu
595 600 605

Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala
610 615 620



Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Thr Xaa Pro Xaa Met Tyr
625 630 635 640

Asp Gln Cys Lys His Met Leu Xaa Val Ser Ser Glu Leu His Arg Leu
645 650 655

Gln Val Ser Xaa Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu
660 665 670

Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gln Glu Xaa Phe Asp Glu
675 680 685

Ile Arg Xaa Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile Xaa Lys Arg
690 695 700

Glu Gly Asn Ser Ser Gln Asn Xaa Gln Arg Phe Tyr Gln Leu Thr Lys
705 710 715 720

Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Xaa Asn Tyr Cys
725 730 735

Phe Gln Thr Phe Xaa Asp Lys Thr Met Ser Ile Glu Phe Pro Glu Met
740 745 750

Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Xaa Ser Asn Gly Asn
755 760 765

Ile Lys Lys Leu Leu Phe His Gln Lys
770 775



<210> 9 

<211> 774 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(774)
<223> 



<400> 9 

gtt cct gca acg tta cca caa ctc acc cct acc ctg gtg tca ctg ttg 48
Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

gag gtt att gaa cct gaa gtg tta tat gca gga tat gat agc tct gtt 96
Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

cca gac tca act tgg agg atc atg act acg ctc aac atg tta gga ggg 144
Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

cgg caa gtg att gca gca gtg aaa tgg gca aag gca ata cca ggt ttc 192
Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

agg aac tta cac ctg gat gac caa atg acc cta ctg cag tac tcc tgg 240
Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

atg ttt ctt atg gca ttt gct ctg ggg tgg aga tca tat aga caa tca 288
Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

agt gca aac ctg ctg tgt ttt gct cct gat ctg att att aat gag cag 336
Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

aga atg act cta ccc tgc atg tac gac caa tgt aaa cac atg ctg tat 384
Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

gtt tcc tct gag tta cac agg ctt cag gta tct tat gaa gag tat ctc 432
Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

tgt atg aaa acc tta ctg ctt ctc tct tca gtt cct aag gac ggt ctg 480
Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

aag agc caa gag cta ttt gat gaa att aga atg acc tac atc aaa gag 528
Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

cta gga aaa gcc att gtc aag agg gaa gga aac tcc agc cag aac tgg 576
Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

cag cgg ttt tat caa ctg aca aaa ctc ttg gat tct atg cat gaa gtg 624
Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

gtt gaa aat ctc ctt aac tat tgc ttc caa aca ttt ttg gat aag acc 672
Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

atg agt att gaa ttc ccc gag atg tta gct gaa atc atc acc aat cag 720
Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

ata cca aaa tat tca aat gga aat atc aaa aaa ctt ctg ttt cat caa 768
Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

aag tga 774
Lys



<210> 10 

<211> 257 

<212> PRT

<213> Homo sapiens 

<400> 10 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 11 

<211> 1548 

<212> DNA

<213> Homo sapiens 



<220>

<221> CDS
<222> (1)..(774)
<223>

<400> 11

gtt cct gca acg tta cca caa ctc acc cct acc ctg gtg tca ctg ttg 48
Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

gag gtt att gaa cct gaa gtg tta tat gca gga tat gat agc tct gtt 96
Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

cca gac tca act tgg agg atc atg act acg ctc aac atg tta gga ggg 144
Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

cgg caa gtg att gca gca gtg aaa tgg gca aag gca ata cca ggt ttc 192
Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

agg aac tta cac ctg gat gac caa atg acc cta ctg cag tac tcc tgg 240
Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

atg tcc ctt atg gca ttt gct ctg ggg tgg aga tca tat aga caa tca 288
Met Ser Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

agt gca aac ctg ctg tgt ttt gct cct gat ctg att att aat gag cag 336
Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

aga atg act cta ccc tgc atg tac gac caa tgt aaa cac atg ctg tat 384
Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

gtt tcc tct gag tta cac agg ctt cag gta tct tat gaa gag tat ctc 432
Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

tgt atg aaa acc tta ctg ctt ctc tct tca gtt cct aag gac ggt ctg 480
Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

aag agc caa gag cta ttt gat gaa att aga atg acc tac atc aaa gag 528
Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

cta gga aaa gcc att gtc aag agg gaa gga aac tcc agc cag aac tgg 576
Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

cag cgg ttt tat caa ctg aca aaa ctc ttg gat tct atg cat gaa gtg 624
Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

gtt gaa aat ctc ctt aac tat tgc ttc caa aca ttt ttg gat aag acc 672
Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

atg agt att gaa ttc ccc gag atg tta gct gaa atc atc acc aat cag 720
Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

ata cca aaa tat tca aat gga aat atc aaa aaa ctt ctg ttt cat caa 768
Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

aag tga gttcctgcaa cgttaccaca actcacccct accctggtgt cactgttgga 824
Lys

ggttattgaa cctgaagtgt tatatgcagg atatgatagc tctgttccag actcaacttg 884
gaggatcatg actacgctca acatgttagg agggcggcaa gtgattgcag cagtgaaatg 944
ggcaaaggca ataccaggtt tcaggaactt acacctggat gaccaaatga ccctactgca 1004
gtactcctgg atgtccctta tggcatttgc tctggggtgg agatcatata gacaatcaag 1064
tgcaaacctg ctgtgttttg ctcctgatct gattattaat gagcagagaa tgactctacc 1124
ctgcatgtac gaccaatgta aacacatgct gtatgtttcc tctgagttac acaggcttca 1184
ggtatcttat gaagagtatc tctgtatgaa aaccttactg cttctctctt cagttcctaa 1244
ggacggtctg aagagccaag agctatttga tgaaattaga atgacctaca tcaaagagct 1304
aggaaaagcc attgtcaaga gggaaggaaa ctccagccag aactggcagc ggttttatca 1364
actgacaaaa ctcttggatt ctatgcatga agtggttgaa aatctcctta actattgctt 1424
ccaaacattt ttggataaga ccatgagtat tgaattcccc gagatgttag ctgaaatcat 1484
caccaatcag ataccaaaat attcaaatgg aaatatcaaa aaacttctgt ttcatcaaaa 1544
gtga 1548

<210> 12 

<211> 257 

<212> PRT

<213> Homo sapiens 

<400> 12 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Ser Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160



PCT/US02/22648 



Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 13 

<211> 774 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(774) 

<223> 



<400> 13 

gtt cct gca acg tta cca caa ctc acc cct acc ctg gtg tca ctg ttg 48
Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

gag gtt att gaa cct gaa gtg tta tat gca gga tat gat agc tct gtt 96
Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

cca gac tca act tgg agg atc atg act acg ctc aac atg tta gga ggg 144
Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

cgg caa gtg att gca gca gtg aaa tgg gca aag gca ata cca ggt ttc 192
Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

agg aac tta cac ctg gat gac caa atg acc cta ctg cag tac tcc tgg 240
Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

atg gac ctt atg gca ttt gct ctg ggg tgg aga tca tat aga caa tca 288
Met Asp Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95



WO 03/015692 






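agt gca aac ctg ctg tgt ttt gct cct gat ctg att att aat gag cag 336
Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

aga atg act cta ccc tgc atg tac gac caa tgt aaa cac atg ctg tat 384
Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

gtt tcc tct gag tta cac agg ctt cag gta tct tat gaa gag tat ctc 432
Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

tgt atg aaa acc tta ctg ctt ctc tct tca gtt cct aag gac ggt ctg 480
Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

aag agc caa gag cta ttt gat gaa att aga atg acc tac atc aaa gag 528
Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

cta gga aaa gcc att gtc aag agg gaa gga aac tcc agc cag aac tgg 576
Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

cag cgg ttt tat caa ctg aca aaa ctc ttg gat tct atg cat gaa gtg 624
Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

gtt gaa aat ctc ctt aac tat tgc ttc caa aca ttt ttg gat aag acc 672
Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

atg agt att gaa ttc ccc gag atg tta gct gaa atc atc acc aat cag 720
Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

ata cca aaa tat tca aat gga aat atc aaa aaa ctt ctg ttt cat caa 768
Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

aag tga 774
Lys

<210> 14

<211> 257

<212> PRT

<213> Homo sapiens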



<400> 14 



Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Asp Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 15 

<211> 774 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> CDS 

<222> (1)..(774) 



<223> n = a or c or g or t/u such that Xaa at positions 552, 557, 602,
636, 648, 712, 741, 535, 538, 638, 691, 702, 648, 660, 685, 733
and 764 can independently be Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.



<220>
<221> misc_feature
<222> (1)..(774)
<223>



<400> 15 

gtt cct gca acg tta cca caa ctc acc cct acc ctg gtg tca nnn ttg 48
Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Xaa Leu
1 5 10 15

gag nnn att gaa cct gaa gtg tta tat gca gga tat gat agc tct nnn 96
Glu Xaa Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Xaa
20 25 30

cca gac tca act nnn agg atc atg act acg ctc aac atg tta gga ggg 144
Pro Asp Ser Thr Xaa Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

cgg caa gtg att gca gca gtg aaa tgg gca aag gca ata cca ggt ttc 192
Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

agg aac tta cac ctg gat gac caa atg acc cta ctg cag tac tcc tgg 240
Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

atg nnn ctt atg gca ttt gct ctg ggg tgg aga tca tat aga caa tca 288
Met Xaa Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

agt gca aac ctg ctg tgt ttt gct cct gat ctg att att aat gag cag 336
Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

aga atg act nnn ccc nnn atg tac gac caa tgt aaa cac atg ctg nnn 384
Arg Met Thr Xaa Pro Xaa Met Tyr Asp Gln Cys Lys His Met Leu Xaa
115 120 125

gtt tcc tct gag tta cac agg ctt cag gta tct nnn gaa gag tat ctc 432
Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Xaa Glu Glu Tyr Leu
130 135 140

tgt atg aaa acc tta ctg ctt ctc tct tca gtt cct aag gac ggt ctg 480
Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

aag agc caa gag nnn ttt gat gaa att aga nnn acc tac atc aaa gag 528
Lys Ser Gln Glu Xaa Phe Asp Glu Ile Arg Xaa Thr Tyr Ile Lys Glu
165 170 175

cta gga aaa gcc att nnn aag agg gaa gga aac tcc agc cag aac nnn 576
Leu Gly Lys Ala Ile Xaa Lys Arg Glu Gly Asn Ser Ser Gln Asn Xaa
180 185 190

cag cgg ttt tat caa ctg aca aaa ctc ttg gat tct atg cat gaa gtg 624
Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

gtt gaa aat ctc nnn aac tat tgc ttc caa aca ttt nnn gat aag acc 672
Val Glu Asn Leu Xaa Asn Tyr Cys Phe Gln Thr Phe Xaa Asp Lys Thr
210 215 220

atg agt att gaa ttc ccc gag atg tta gct gaa atc atc acc aat cag 720
Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

ata cca aaa nnn tca aat gga aat atc aaa aaa ctt ctg ttt cat caa 768
Ile Pro Lys Xaa Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

aag tga 774
Lys



<210> 16 

<211> 257 

<212> PRT 

<213> Homo sapiens 

<220>
<221> misc_feature
<222> (15)..(15)
<223> The 'Xaa' at location 15 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (18)..(18)
<223> The 'Xaa' at location 18 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (32)..(32)
<223> The 'Xaa' at location 32 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (37)..(37)
<223> The 'Xaa' at location 37 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (82)..(82)
<223> The 'Xaa' at location 82 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (116)..(116)
<223> The 'Xaa' at location 116 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (118)..(118)
<223> The 'Xaa' at location 118 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (128)..(128)
<223> The 'Xaa' at location 128 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (140)..(140)
<223> The 'Xaa' at location 140 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (165)..(165)
<223> The 'Xaa' at location 165 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (171)..(171)
<223> The 'Xaa' at location 171 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (182)..(182)
<223> The 'Xaa' at location 182 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (192)..(192)
<223> The 'Xaa' at location 192 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (213)..(213)
<223> The 'Xaa' at location 213 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (221)..(221)
<223> The 'Xaa' at location 221 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (244)..(244)
<223> The 'Xaa' at location 244 stands for Cys, Asn, Tyr, Lys, Ser, Asp,
Glu, Gln, Arg or Thr.

<220>
<221> misc_feature
<222> (1)..(774)
<223> n = a or c or g or t/u such that Xaa at positions 552, 557, 602,
636, 648, 712, 741, 535, 538, 638, 691, 702, 648, 660, 685, 733
and 764 can independently be Cys, Asn, Tyr, Lys, Ser, Asp, Glu, Gln,
Arg or Thr

<400> 16 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Xaa Leu
1 5 10 15

Glu Xaa Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Xaa
20 25 30

Pro Asp Ser Thr Xaa Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Xaa Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Xaa Pro Xaa Met Tyr Asp Gln Cys Lys His Met Leu Xaa
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Xaa Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Xaa Phe Asp Glu Ile Arg Xaa Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Xaa Lys Arg Glu Gly Asn Ser Ser Gln Asn Xaa
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Xaa Asn Tyr Cys Phe Gln Thr Phe Xaa Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Xaa Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 17 

<211> 25 

<212> PRT 

<213> Homo sapiens 

<400> 17 



Gln Glu Pro Val Ser Pro Lys Lys Lys Glu Asn Ala Leu Leu Arg Tyr
1 5 10 15

Leu Leu Asp Lys Asp Asp Thr Lys Asp
20 25



<210> 18 

<211> 5 

<212> PRT 

<213> Homo sapiens 

<220>

<221> misc_feature 

<222> (1)..(5) 

<223> Xaa is any amino acid 

<400> 18 

Leu Xaa Xaa Leu Leu
1 5

<210> 19

<211> 67 

<212> DNA 

<213> Homo sapiens 



<400> 19 

cggcggcgcc acatgaaaaa aggtcatcat catcatcatc atggttcccc tatactaggt 60 
tattgga 67 

<210> 20 

<211> 33 

<212> DNA 

<213> Homo sapiens 



<400> 20 

cggcggcgcg gatccacgcg gaaccagatc cga 33 

<210> 21 

<211> 237 

<212> PRT 

<213> Homo sapiens

<400> 21 

Met Lys Lys Gly His His His His His His His Gly Ser Pro Ile Leu
1 5 10 15

Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro Thr Arg Leu Leu Leu
20 25 30

Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu Tyr Glu Arg Asp Glu
35 40 45

Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu Gly Leu Glu Phe Pro
50 55 60

Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys Leu Thr Gln Ser Met
65 70 75 80

Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn Met Leu Gly Gly Cys
85 90 95

Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu Gly Ala Val Leu Asp
100 105 110

Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser Lys Asp Phe Glu Thr
115 120 125

Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu Met Leu Lys Met Phe
130 135 140

Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn Gly Asp His Val Thr
145 150 155 160

His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp Val Val Leu Tyr Met
165 170 175

Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu Val Cys Phe Lys Lys
180 185 190

Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr Leu Lys Ser Ser Lys
195 200 205

Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala Thr Phe Gly Gly Gly
210 215 220

Asp His Pro Pro Lys Ser Asp Leu Val Pro Arg Gly Ser
225 230 235

<210> 22 

<211> 32 

<212> DNA 

<213> Homo sapiens 



<400> 22 

tactcctgga tgtcccttat ggcatttgct ct 32

<210> 23 

<211> 32 

<212> DNA 

<213> Homo sapiens 



<400> 23 

agagcaaatg ccataaggga catccaggag ta 32 

<210> 24 

<211> 32 

<212> DNA 

<213> Homo sapiens 

<400> 24 

tactcctgga tggaccttat ggcatttgct ct 32 

<210> 25 

<211> 32 

<212> DNA 

<213> Homo sapiens

<400> 25

agagcaaatg ccataaggtc catccaggag ta 32 

<210> 26 

<211> 252 

<212> PRT 

<213> Homo sapiens 

<400> 26 

Ala Leu Thr Pro Ser Pro Val Met Val Leu Glu Asn Ile Glu Pro Glu
1 5 10 15

Ile Val Tyr Ala Gly Tyr Asp Ser Ser Lys Pro Asp Thr Ala Glu Asn
20 25 30

Leu Leu Ser Thr Leu Asn Arg Leu Ala Gly Lys Gln Met Ile Gln Val
35 40 45

Val Lys Trp Ala Lys Val Leu Pro Gly Phe Lys Asn Leu Pro Leu Glu
50 55 60

Asp Gln Ile Thr Leu Ile Gln Tyr Ser Trp Met Cys Leu Ser Ser Phe
65 70 75 80

Ala Leu Ser Trp Arg Ser Tyr Lys His Thr Asn Ser Gln Phe Leu Tyr
85 90 95

Phe Ala Pro Asp Leu Val Phe Asn Glu Glu Lys Met His Gln Ser Ala
100 105 110

Met Tyr Glu Leu Cys Gln Gly Met His Gln Ile Ser Leu Gln Phe Val
115 120 125

Arg Leu Gln Leu Thr Phe Glu Glu Tyr Thr Ile Met Lys Val Leu Leu
130 135 140

Leu Leu Ser Thr Ile Pro Lys Asp Gly Leu Lys Ser Gln Ala Ala Phe
145 150 155 160

Glu Glu Met Arg Thr Asn Tyr Ile Lys Glu Leu Arg Lys Met Val Thr
165 170 175

Lys Cys Pro Asn Asn Ser Gly Gln Ser Trp Gln Arg Phe Tyr Gln Leu
180 185 190

Thr Lys Leu Leu Asp Ser Met His Asp Leu Val Ser Asp Leu Leu Glu
195 200 205

Phe Cys Phe Tyr Thr Phe Arg Glu Ser His Ala Leu Lys Val Glu Phe
210 215 220

Pro Ala Met Leu Val Glu Ile Ile Ser Asp Gln Leu Pro Lys Val Glu
225 230 235 240

Ser Gly Asn Ala Lys Pro Leu Tyr Phe His Arg Lys
245 250

<210> 27 

<211> 252 

<212> PRT 

<213> Homo sapiens 

<400> 27 

Gln Leu Ile Pro Pro Leu Ile Asn Leu Leu Met Ser Ile Glu Pro Asp
1 5 10 15

Val Ile Tyr Ala Gly His Asp Asn Thr Lys Pro Asp Thr Ser Ser Ser
20 25 30

Leu Leu Thr Ser Leu Asn Gln Leu Gly Glu Arg Gln Leu Leu Ser Val
35 40 45

Val Lys Trp Ser Lys Ser Leu Pro Gly Phe Arg Asn Leu His Ile Asp
50 55 60

Asp Gln Ile Thr Leu Ile Gln Tyr Ser Trp Met Ser Leu Met Val Phe
65 70 75 80

Gly Leu Gly Trp Arg Ser Tyr Lys His Val Ser Gly Gln Met Leu Tyr
85 90 95

Phe Ala Pro Asp Leu Ile Leu Asn Glu Gln Arg Met Lys Glu Ser Ser
100 105 110

Phe Tyr Ser Leu Cys Leu Thr Met Trp Gln Ile Pro Gln Glu Phe Val
115 120 125

Lys Leu Gln Val Ser Gln Glu Glu Phe Leu Cys Met Lys Val Leu Leu
130 135 140

Leu Leu Asn Thr Ile Pro Leu Glu Gly Leu Arg Ser Gln Thr Gln Phe
145 150 155 160

Glu Glu Met Arg Ser Ser Tyr Ile Arg Glu Leu Ile Lys Ala Ile Gly
165 170 175

Leu Arg Gln Lys Gly Val Val Ser Ser Ser Gln Arg Phe Tyr Gln Leu
180 185 190

Thr Lys Leu Leu Asp Asn Leu His Asp Leu Val Lys Gln Leu His Leu
195 200 205

Tyr Cys Leu Asn Thr Phe Ile Gln Ser Arg Ala Leu Ser Val Glu Phe
210 215 220

Pro Glu Met Met Ser Glu Val Ile Ala Ala Gln Leu Pro Lys Ile Leu
225 230 235 240

Ala Gly Met Val Lys Pro Leu Leu Phe His Lys Lys
245 250



<210> 28

<211> 252

<212> PRT

<213> Homo sapiens

<400> 28



Glu Cys Gln Pro Ile Phe Leu Asn Val Leu Glu Ala Ile Glu Pro Gly
1 5 10 15

Val Val Cys Ala Gly His Asp Asn Asn Gln Pro Asp Ser Phe Ala Ala
20 25 30

Leu Leu Ser Ser Leu Asn Glu Leu Gly Glu Arg Gln Leu Val His Val
35 40 45

Val Lys Trp Ala Lys Ala Leu Pro Gly Phe Arg Asn Leu His Val Asp
50 55 60

Asp Gln Met Ala Val Ile Gln Tyr Ser Trp Met Gly Leu Met Val Phe
65 70 75 80

Ala Met Gly Trp Arg Ser Phe Thr Asn Val Asn Ser Arg Met Leu Tyr
85 90 95

Phe Ala Pro Asp Leu Val Phe Asn Glu Tyr Arg Met His Lys Ser Arg
100 105 110

Met Tyr Ser Gln Cys Val Arg Met Arg His Leu Ser Gln Glu Phe Gly
115 120 125

Trp Leu Gln Ile Thr Pro Gln Glu Phe Leu Cys Met Lys Ala Leu Leu
130 135 140

Leu Phe Ser Ile Ile Pro Val Asp Gly Leu Lys Asn Gln Lys Phe Phe
145 150 155 160

Asp Glu Leu Arg Met Asn Tyr Ile Lys Glu Leu Asp Arg Ile Ile Ala
165 170 175

Cys Lys Arg Lys Asn Pro Thr Ser Cys Ser Arg Arg Phe Tyr Gln Leu
180 185 190

Thr Lys Leu Leu Asp Ser Val Gln Pro Ile Ala Arg Glu Leu His Gln
195 200 205

Phe Thr Phe Asp Leu Leu Ile Lys Ser His Met Val Ser Val Asp Phe
210 215 220

Pro Glu Met Met Ala Glu Ile Ile Ser Val Gln Val Pro Lys Ile Leu
225 230 235 240

Ser Gly Lys Val Lys Pro Ile Tyr Phe His Thr Gln
245 250



<210> 29 

<211> 286 

<212> PRT

<213> Homo sapiens 

<400> 29 



Leu Thr Ala Asp Gln Met Val Ser Ala Leu Leu Asp Ala Glu Pro Pro
1 5 10 15

Ile Leu Tyr Ser Glu Tyr Asp Pro Thr Arg Pro Phe Ser Glu Ala Ser
20 25 30

Met Met Gly Leu Leu Thr Asn Leu Ala Asp Arg Glu Leu Val His Met
35 40 45

Ile Asn Trp Ala Lys Arg Val Pro Gly Phe Val Asp Leu Thr Leu His
50 55 60

Asp Gln Val His Leu Leu Glu Cys Ala Trp Leu Glu Ile Leu Met Ile
65 70 75 80

Gly Leu Val Trp Arg Ser Met Glu His Pro Gly Lys Leu Leu Phe Ala
85 90 95

Pro Asn Leu Leu Leu Asp Arg Asn Gln Gly Lys Cys Val Glu Gly Met
100 105 110

Val Glu Ile Phe Asp Met Leu Leu Ala Thr Ser Ser Arg Phe Arg Met
115 120 125

Met Asn Leu Gln Gly Glu Glu Phe Val Cys Leu Lys Ser Ile Ile Leu
130 135 140

Leu Asn Ser Gly Val Tyr Thr Phe Leu Ser Ser Thr Leu Lys Ser Leu
145 150 155 160

Glu Glu Lys Asp His Ile His Arg Val Leu Asp Lys Ile Thr Asp Thr
165 170 175

Leu Ile His Leu Met Ala Lys Ala Gly Leu Thr Leu Gln Gln Gln His
180 185 190

Gln Arg Leu Ala Gln Leu Leu Leu Ile Leu Ser His Ile Arg His Met
195 200 205

Ser Asn Lys Gly Met Glu His Leu Tyr Ser Met Lys Cys Lys Asn Val
210 215 220

Val Pro Leu Tyr Asp Leu Leu Leu Glu Met Leu Asp Ala His Arg Leu
225 230 235 240

His Ala Pro Thr Ser Arg Gly Gly Ala Ser Val Glu Glu Thr Asp Gln
245 250 255

Ser His Leu Ala Thr Ala Gly Ser Thr Ser Ser His Ser Leu Gln Lys
260 265 270

Tyr Tyr Ile Thr Gly Glu Ala Glu Gly Phe Pro Ala Thr Val
275 280 285


<210> 30

<211> 268

<212> PRT

<213> Homo sapiens

<400> 30



Leu Ser Pro Glu Gln Leu Val Leu Thr Leu Leu Glu Ala Glu Pro Pro
1 5 10 15

His Val Leu Ile Ser Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser Met
20 25 30

Met Met Ser Leu Thr Lys Leu Ala Asp Lys Glu Leu Val His Met Ile
35 40 45

Ser Trp Ala Lys Lys Ile Pro Gly Phe Val Glu Leu Ser Leu Phe Asp
50 55 60

Gln Val Arg Leu Leu Glu Ser Cys Trp Met Glu Val Leu Met Met Gly
65 70 75 80

Leu Met Trp Arg Ser Ile Asp His Pro Gly Lys Leu Ile Phe Ala Pro
85 90 95

Asp Leu Val Leu Asp Arg Asp Glu Gly Lys Cys Val Glu Gly Ile Leu
100 105 110

Glu Ile Phe Asp Met Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu Leu
115 120 125

Lys Leu Gln His Lys Glu Tyr Leu Cys Val Lys Ala Met Ile Leu Leu
130 135 140

Asn Ser Ser Met Tyr Pro Leu Val Thr Ala Thr Gln Asp Ala Asp Ser
145 150 155 160

Ser Arg Lys Leu Ala His Leu Leu Asn Ala Val Thr Asp Ala Leu Val
165 170 175

Trp Val Ile Ala Lys Ser Gly Ile Ser Ser Gln Gln Gln Ser Met Arg
180 185 190

Leu Ala Asn Leu Leu Met Leu Leu Ser His Val Arg His Ala Ser Asn
195 200 205

Lys Gly Met Glu His Leu Leu Asn Met Lys Cys Lys Asn Val Val Pro
210 215 220

Val Tyr Asp Leu Leu Leu Glu Met Leu Asn Ala His Val Leu Arg Gly
225 230 235 240

Cys Lys Ser Ser Ile Thr Gly Ser Glu Cys Ser Pro Ala Glu Asp Ser
245 250 255

Lys Ser Lys Glu Gly Ser Gln Asn Pro Gln Ser Gln
260 265

<210> 31 

<211> 251 

<212> PRT

<213> Homo sapiens 






<400> 31 

Gln Leu Thr Pro Thr Leu Val Ser Leu Leu Glu Val Ile Glu Pro Glu
1 5 10 15

Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser Thr Trp Arg
20 25 30

Ile Met Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala Ala
35 40 45

Val Lys Trp Ala Lys Ala Ile Pro Gly Phe Arg Asn Leu His Leu Asp
50 55 60

Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp Met Ser Leu Met Ala Phe
65 70 75 80

Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser Ser Ala Asn Leu Leu Cys
85 90 95

Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Thr Leu Pro Cys
100 105 110

Met Tyr Asp Gln Cys Lys His Met Leu Tyr Val Ser Ser Glu Leu His
115 120 125

Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu Cys Met Lys Thr Leu Leu
130 135 140

Leu Leu Ser Ser Val Pro Lys Asp Gly Leu Lys Ser Gln Glu Leu Phe
145 150 155 160

Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile Val
165 170 175

Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp Gln Arg Phe Tyr Gln Leu
180 185 190

Thr Lys Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Leu Asn
195 200 205

Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr Met Ser Ile Glu Phe Pro
210 215 220

Glu Met Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Tyr Ser Asn
225 230 235 240

Gly Asn Ile Lys Lys Leu Leu Phe His Gln Lys
245 250



<210> 32 

<211> 259 

<212> PRT 

<213> Homo sapiens 

<400> 32 

Gly Ser Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser
1 5 10 15

Leu Leu Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser
20 25 30

Ser Val Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu
35 40 45

Gly Gly Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro
50 55 60

Gly Phe Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr
65 70 75 80

Ser Trp Met Ser Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg
85 90 95

Gln Ser Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn
100 105 110

Glu Gln Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met
115 120 125

Leu Tyr Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu
130 135 140

Tyr Leu Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp
145 150 155 160

Gly Leu Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile
165 170 175

Lys Glu Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln
180 185 190

Asn Trp Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His
195 200 205

Glu Val Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp
210 215 220



Lys Thr Met Ser Ile Glu Phe Pro Glu Met Leu Ala ^ n- n# ^
225 230 235 240



Asn Gln Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe
245 250 255



His Gln Lys



<210> 33 

<211> 257 

<212> PRT 

<213> Homo sapiens 



<400> 33 



Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Arg Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys

<210> 34 

<211> 257

<212> PRT 

<213> Homo sapiens 

<400> 34 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Leu Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140
Cys Met Lys Thr Leu , Leu Leu Ser Ser ^ ^ ftsp g ^
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr ^ ^ ^ ^ ^ ^ ^ ^
195 200 205



Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 35 

<211> 257 

<212> PRT

<213> Homo sapiens 

<400> 35 



Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg His Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 36 

<211> 257 

<212> PRT 

<213> Homo sapiens 

<400> 36 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Thr Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 37 

<211> 257 

<212> PRT 

<213> Homo sapiens 

<400> 37 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Met Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 38 

<211> 257 

<212> PRT 

<213> Homo sapiens 

<400> 38 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Thr Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Leu Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys




<210> 39

<211> 257

<212> PRT

<213> Homo sapiens

<400> 39



Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Phe Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Cys Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205



Val Glu Asn
210 215 220

Met Ser Ile
225 230 235 240

Ile Pro Lys
245 250 255

Lys

<210> 40

<211> 257

<212> PRT

<213> Homo sapiens

<400> 40

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr His
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr
210 215 220

Met Ser Ile Glu Phe Pro Glu Thr Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys



<210> 41 

<211> 257 

<212> PRT 

<213> Homo sapiens 

<400> 41 

Val Pro Ala Thr Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu
1 5 10 15

Glu Val Ile Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val
20 25 30

Pro Asp Ser Thr Trp Arg Ile Met Thr Thr Phe Asn Met Leu Gly Gly
35 40 45

Arg Gln Val Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Pro Gly Phe
50 55 60

Arg Asn Leu His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp
65 70 75 80

Met Phe Leu Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser
85 90 95

Ser Ala Asn Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln
100 105 110

Arg Met Thr Leu Pro Cys Met Tyr Asp Gln Cys Lys His Met Leu Tyr
115 120 125

Val Ser Ser Glu Leu His Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu
130 135 140

Cys Met Lys Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Asp Gly Leu
145 150 155 160

Lys Ser Gln Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu
165 170 175

Leu Gly Lys Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp
180 185 190

Gln Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val
195 200 205

Val Glu Asn Leu Leu Asn Tyr Cys Phe Gln Thr Phe Leu Asp Lys Asn
210 215 220

Met Ser Ile Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln
225 230 235 240

Ile Pro Lys Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln
245 250 255

Lys